Lambda Calculus: A gentle introduction and broader context

I TA’d a course on programming languages in Fall 2019, and consistently got questions about lambda calculus. The most important questions weren’t about the mechanics and grammar, but the bigger picture: “Why are we expressing programs in this convoluted and tedious way? What’s the point? What’s all of this for?” This post will address those broader concerns, provide a crash course in lambda-calculus thinking, and hopefully put lambda calc in context.

Lambda calculus is the smallest, simplest version of a programming language we can come up with. We can prove that we can do all kinds of things in lambda calc, even if it’s very tedious and unpleasant. Then we show that if we can implement a lambda calc interpreter in a given language (like Python), that language must be able to do everything that lambda calc can do. In this way lambda calculus is similar to Turing machines, in that if a language can simulate any arbitrary turing machine then it is “turing-complete” and can do everything turing machines can do.

We’ll do a brief example of the idea, without using lambda calculus syntax, which has a bit of a learning curve.

Lambda Calc: Functions

In lambda calculus we have functions that take exactly one argument. There are no variables except constants, no functions of multiple arguments, no constructs like lists or tuples or objects. Just functions of one argument, and some rules about how variable scope works. However, functions are able to create and return other functions.

If lambda calc had a concept of numbers and addition (it does not, but we can create such things) a Python-like syntax might look like:

def add(x):
        def add2(y):
                return x + y
        return add2

# Equivalent lambda-calc syntax
# \x.\y.x+y

We can now use the above like (add(2) 3) to add 2 and 3 and get back 5. This is obviously logically equivalent to a function of two arguments, like add(2, 3). Therefore, we’ve simulated functions of multiple arguments using only functions of one argument.

Lambda Calc: If-statements

Next let’s define true and false and build an if-statement, since we can’t express a lot of important logic without an if-statement.

def true(a):
        def true2(b):
                return a
        return true2

def false(a):
        def false2(b):
                return b
        return false2

# Equivalent lambda-calc syntax
# True:  \a.\b.a
# False: \a.\b.b

If we think of these as functions with two arguments, true(a,b) always returns the first argument, and false(a,b) always returns the second argument. This seems like an awkward definition, but it comes in handy when we define an if-statement:

def if_then_else(boolean):
        def if2(true_path):
                def if3(false_path):
                        return (boolean(true_path) false_path)
                return if3
        return if2

# Equivalent lambda-calc syntax
# \p.\a.\b.p a b

We can think of this as a function that takes three arguments, where the first argument is either “true” or “false”, using our previous definition. The usage works like this:

if_then_else(true, 1, 3) returns 1

if_then_else(false, 1, 3) returns 3

Okay, that looks a little more familiar! The internal representation of true/false might still feel strange, but now we have functions with multiple arguments, and we have if-statements. This language is already a lot more useful than what we started with.

Lambda Calc: Lists

Let’s tackle something more complicated: Building a list. We’ll start a little smaller and build a pair, like (2,6):

def mkpair(a):
        def mkpair2(b):
                def access(first_or_second):
                        return ((if_then_else(first_or_second) a) b)
                return access
        return mkpair2

# Equivalent lambda-calc syntax
# \a.\b.\c.c a b

In plain English, this says “take three arguments. If the third argument is true, return the first argument. If the third argument is false, return the second argument.” We’ll use “true” and “false” as the third parameter to accomplish (2,6).first or (2,6)[0] depending on the kind of language you’re used to writing in. We can even write some helper functions to make this interface friendlier:

def first(pair):
        return (pair true)

def second(pair):
        return (pair false)

# Equivalent lambda-calc syntax
# First:  \p.p (\a.\b.a)
# Second: \p.p (\a.\b.b)

Now we can use pairs like:

def addAPair(pair):
        return (add(first pair)) (second pair)

def createAndAddPair(x):
        return addAPair( ((mkpair 2) 6) )

A lot of parenthesis to sort out, but by only passing the first two arguments to mkpair (i.e. (mkpair 2) 6)) we create a pair of elements, and we can call first or second on this pair to extract the two halves. Great!

Now we can generalize to a list of arbitrary length, by defining a list as a pair of pairs. In other words, we can represent a concept like [1,2,3,4,5] as pairs like (1,(2,(3,(4,5)))). It’s tedious, but we can obviously encode data of any length this way.

What if we want to write a function that can process this list? How can we convey that a list has X elements in it? Well, there are a few design choices we can make. One option is including the length as the first element of the list:


This gets a little messy, since any function that’s processing the list will have to peel the first two elements off to get any data, and will have to pre-pend a new first element to update the length, like:





Alternatively, we could include booleans at each layer to indicate whether the list has more elements. This might look something like:

(true, (1, (true, (2, (false, 3)))))

This may be easier to work with since we don’t need to continue updating the list length every time we remove an element, but it also requires storing twice as many items to express the data in the list.

Returning to Concept-Land

We started with a language that only contains functions of one variable, and using this concept alone we’ve created functions of multiple arguments, boolean logic, if-statements, pairs, and lists. So much from so little!

This is precisely the reason lambda calculus exists. From an academic perspective, it’s much easier to reason about programming and write formal proofs when the building blocks are small and simple, and we’ve shown you need very little to have a capable programming language.

From an engineering perspective, lambda calculus is a useful concept for thinking about how logic can be translated from one language to another. An imperative language like C or Python works very differently than a functional language like Haskell, or a logic language like Prolog. But if we can express our thinking in C or Python in terms of functions and lists, and we can express functions and lists in lambda calculus, then those same ideas can be expressed in Haskell or Prolog, even if the implementation looks radically different.

Lambda calculus provides a baseline for what programming languages can express, and how ideas can be translated between languages. This kind of thinking (along with lambda calculus’ variable-scoping rules, which we haven’t talked about in this post) form some of the early foundations of compiler and interpreter design.

Posted 3/22/20

Consensus Models in the Pursuance Paradigm

I’ve written a lot recently about defining group dynamics in Pursuance. I’ve outlined a role-based system for describing permissions and responsibilities. So far, however, the role language has been limited to describing rather anarchic systems: Anyone with a role assumes the full powers of the role and can act independently of anyone else with the role. While this is sufficient for describing many organizational structures, especially smaller and short-lived ones, it falls short of describing collective decision making. This post will discuss a few broad categories of collective action, and methods of extending the Pursuance role language proposed in previous posts to describe group decision-making.

Collective Action Styles

Very broadly speaking, there are two categories of group self-governance. In the first, decisions are made by majority vote, as in democracratic and parliamentary systems. There may be varying vote cutoffs depending on the action proposed, and different ways of counting the vote (plurality, ranked-choice, first-past-the-post, …), but the fundamental structure is “the group does whatever the most people want.” There’s a lot of complexity, advantages, and drawbacks of various parliamentary-like systems, but they’re out of scope for this post. Our goal is to enable groups to choose their own organizational structure and define it within Pursuance.

In the second category, decisions are made by global consensus. This can mean that the entire community votes on all decisions, but more commonly the group delegates decisions on certain topics to sub-groups who obtain internal consensus, as in Clusters & Spokes Councils.

Collective Role Permissions

We can describe collective action with a simple language primitive. Here we describe a “member” role, where users holding this position can kick any other member if they can get two additional members to agree with them:

member {
    consensus 3 {
        kick member

We can also describe consensus as a percentage of the users holding the role. Here we create a journalist role, where all journalists can agree to bring another into the fold:

journalist {
    consensus 100% {
       invite journalist

Group Decision Interface

What does consensus look like in the UI of a platform like Pursuance? It can appear like another task, only visible to users within the role making the decision, with an added function of “approve/disapprove”. Unlike normal tasks, which are closed when someone with the authority to ends it, decision tasks are closed automatically when enough users vote to approve or dissaprove the decision.

Since group decisions are implemented as tasks, they implicitly provide a discussion space about the decision being made.

If blind voting is desired, we can substitute “secret-consensus” in place of “consensus”. In fact, it might be clearer if the language is “public-consensus” and “secret-consensus” to make the visibility of votes unambiguous at all times.

The proposer is always public, even in secret consensus. This is akin to someone calling for a vote of no-confidence: The votes may be secret, but someone publicly called for the vote. This is beneficial because it prevents an abusive scenario where one person creates hundreds of secret consensus actions and grinds the structure to a halt.

A Tiny Parliament

Below is a miniature social structure for a parliamentary organization with two office positions and a role for general body members:

president {
   # Presidential powers here
   unassign role president

treasurer {
	# Treasurer powers here
	# Maybe the role has no powers, and exists to provide 0auth access to
	# budget spreadsheets and financial accounts

member {
   consensus 50% {
       assign role president
       assign role treasurer
   consensus 90% {
       unassign role president

Via a 50% vote of all general members, the membership can elect a new president or treasurer. With a 90% consensus, members can pass a vote of no-confidence and evict a president.

A Tiny Consensus Group

An organization more akin to the Quakers or Occupy Wall Street, with an affinity for clusters and spokes councils, may want to distinguish operations any member can do, and operations only total consensus can achieve:

press-committee {
    # Has the right to represent the organization to the outside

member {
   consensus 100% {
       assign role press-committee
   unassign role press-committee

In the above system, the entire community must agree to add a member to a committee with higher authority, and any member can revoke that user’s privileges, ensuring the total membership consents to the decisions being made by that committee.

Ongoing Work

The consensus model outlined above has clear limitations. There’s no room for decisions requiring consent from multiple roles. The way a proposal is rejected is a little unclear unless we use percentages (does a consent 3 fail if any 3 members reject the proposal? Does it only fail when enough users reject the proposal that there are not 3 remaining that could pass the proposal?). Nevertheless, this relatively simple proposal dramatically increases the types of organizations we can represent, and therefore what kinds of groups can effectively organize on Pursuance.

Posted 2/9/20

The Soviet Cybernetic Economy

The Soviet Union designed and redesigned a national computer network that would deliver all economic information to a central database, input it into a feedback loop cybernetic economic model, and make autonomous decisions about resource allocation and production subject to broad objectives set by party leadership.

So that’s a fun opening. A 2008 paper titled “InterNyet: why the Soviet Union did not build a nationwide computer network” details the early 1960’s project, with an emphasis on the social history and political context in which it was ultimately crushed. In this post I’ll write briefly about the project, my own take-aways from the paper, and how this informs related ongoing efforts like the Pursuance Project.

The Academic Field

Cybernetics classically* refers to a study of communications and automatic control systems, usually recursive feedback loops where the output or result of the system is also an input determining the further behavior of the system. This study of recursive systems refers both to artificial constructs, like a thermostat using the temperature to determine whether to activate the heater to adjust the temperature, and living constructs, from ecological self-regulation between predators and prey, to societal behavior.

On the Eastern side of the Cold War, cybernetics was particularly applied to economics, and the creation of computer models where a combination of economic output and public demand drives economic creation.

* The modern understanding of cybernetics referring to cyborgs and biotechnology is derived from classical cybernetics, but is quite distinct.

The Communist Opportunity

One of the primary differences between communism and other political theories is a public-controlled economy. Rather than independent corporations choosing what to produce and in what volumes, a government agency (ostensibly representing the will of the people) assigns resources and quotas, and the factories produce what the state requests. This model is frequently imagined as a centrally-controlled economy (i.e. Moscow decides what is to be produced throughout the Soviet Union), but through most of the history of the U.S.S.R. there were a number of agencies based on either physical location or shared industry that directed the economy, with little top-level collaboration.

The difficulty is in bureaucratic scale. Ideally, a central agency with perfect knowledge of the public’s consumption and desires, and the resources and production levels across the country, could make mathematically-optimal choices about how much of what to produce where and have economic output far more responsive, with far less waste, than in a capitalist model. After all, competing corporations do not share information with one another about their sales or upcoming production, necessarily leading to conflicting choices and waste. Unfortunately, collecting sales information from every store, manifests from every warehouse, production output from every factory, and collating it at a central governing body takes immense resources by hand. Making decisions based on this information requires a small army of managers and economists. It is no wonder the Soviet Union opted for localizing decision-making, sharing only high-level information between agencies to limit overhead. Then the advent of digital computers and electronic networking promised a chance to change everything.

The Soviet Plan

The Soviet plan* was relatively straight-forward: They had existing cybernetic models for a recursive economy that relied on simulated data, now they have communications technology capable of providing real numbers. Combine the two, and the economic model transforms from a theoretical simulation of academic interest to an active decision-maker, guiding the activities of the entire Soviet economy in real-time.

For the late 60s, early 70s, this was an ambitious plan. Every factory and storefront would need to install a computer and digitize all records. A nation-wide (or at least in major cities and at primary distribution centers) computer network would be installed to send records to a number of central data centers. Communication would run in both directions, so the central computer could send instructions back to the fringes of the community.

Ultimately, however, it was not the technical limitations that doomed the project (remember that the Soviets successfully built space and nuclear programs), but political ones. Turning over all minute economic decisions to the computer would eliminate a wide number of bureaucratic posts - the same bureaucrats that were to vote on the implementation of the system. Power struggles between different ministries ensured the full plan would never be deployed. Instead, each ministry implemented a subsection of the plan independently, digitizing their own records and networking their computer systems, with no cross-networking or any serious attempt at cross-compatibility. The result solidified existing power structures instead of revolutionizing the nation.

* I am dramatically simplifying here by combining several iterations of project proposals from a number of Soviet cyberneticians, economists, and politicians.

Ongoing Dreams

The core mission of the Soviet project was to automate away bureaucracy, enabling coordination and decision-making at a high scale that would be infeasible with human decision makers. The depth of the hierarchy, amount of information involved, and near real-time response constraints make automation an absolute necessity.

This is fundamentally the same mission as the Pursuance Project, albeit with different motivations: Delegate bureaucracy and administration to the machine, to allow the rapid creation of social groups that traditionally have significant starting costs. Automation has an added boon of providing a constant presence when the membership of the organization shifts.

The problem space for Pursuance is comparatively small: We already have the physical infrastructure for collaboration (the Internet), and since most groups are built around clear short-term objectives there is a limited need for long-term sustainability in any bureaucracy created. Critically, Pursuance does not face the brunt of the Soviet political entrenchment; by focusing on the creation of new activist groups we bypass any sense of “replacing” a person, and only augment what is possible.

Cybernetic models provide an opportunity to expand what work we offload to the machine in a Pursuance, perhaps enabling greater community adaptation and automation by incorporating human choices as inputs in feedback loops chosen by the community. This is all speculation for the moment, but worth further examination as the Pursuance design grows.

Posted 1/30/20

Pursuance Group Dynamics

In the past several posts I’ve written about the technical design of Pursuance in terms of the permissions and communications systems, and how this allows collaboration between different groups on the platform.

This post will take a detour to talk about what groups on Pursuance will look like from a social and bureaucratic capacity, and how work can actually be accomplished with the Pursuance platform.

The Anarchist Model of Institutions

A pursuance is a group of people with shared objectives, and a set of methods for accomplishing those objectives. The pursuance exists so long as those methods make sense, and the objectives remain unfulfilled. Then it disbands.

This suggests that many pursuances will be short-lived and highly specialized. This is the goal. A pursuance with a dedicated and simple purpose has less history, simpler bureaucracy, and shorter on-boarding. There are fewer disagreements and greater enthusiasm, because everyone agrees on their shared purpose and has a clear understanding of their role. The infrastructure is too simple to support much corruption or obfuscation, and anyone who becomes disillusioned can leave and join a more aligned pursuance.

Complex actions are enabled by collaboration between several appropriate pursuances, each adding their own expertise. Many individuals will be part of several pursuances, applying their personal talents to different projects they agree with, and facilitating communication and discovery between pursuances.

Drawbacks to the Anarchist Model

We occasionally see structured micro-organizations as described above in in-person temporary communities, such as the Occupy Wall Street working groups. Less frequently, however, do we see such micro-organizations in online space. There are many micro-communities, which can be as simple as creating a chatroom in Discord, announcing a topic, and inviting people. However, these groups rarely have an explicit purpose, methodology, or decision-making process.

Startup costs are hard. Founding a new organization means agreeing on a decision-making and leadership dynamic, whether through formal bylaws or informal consensus. Building out infrastructure like a wiki, document storage, public websites, source code management, or collaborative editors requires significant sysadmin work to set up, and then ongoing bureaucratic work to make everyone accounts on each service, track who should have what access to what systems, and remove members from each service as they leave the group. There is a strong incentive here to re-use existing infrastructure, and avoid fracturing and creating new groups if possible.

Pursuance Lowers Startup Costs

The Pursuance rules and roles system acts as a primitive kind of bylaws, governing who has what rights and responsibilities within the group. If we provide an example library of pursuance rules then the group can select the closest template to their needs, make some edits, and have working “bylaws” in minutes, including a structure for adding, promoting, and removing members within the pursuance. They also have a task-based discussion forum, so their earliest communications and planning needs are met.

For many groups this will be insufficient, and they will need additional services described above like wikis, academic reference management, or document tagging, that are tailored to the group’s specific goals. Providing all of this functionality in Pursuance would be futile and foolish: There are too many needs, and plenty of well-developed tools that serve those specific purposes better. Pursuance itself follows the same philosophy of focusing on specific objectives with explicit methods, and exists only to facilitate collaboration between individuals in volunteer-driven groups.

However, Pursuance can still help with the administration and maintenance of these external services. Many technologies support an authentication system called OAuth, which allows users to login to a service like Twitter, Facebook, or Google, and use that same login to gain access to something like Wordpress, without telling Wordpress your Facebook password. We can make each pursuance role an OAuth provider, and allow system administrators to configure their pursuance’s infrastructure to use a role’s OAuth.

Using Pursuance for OAuth means everyone has automatic access to whatever systems they need, with virtually no trace of bureaucracy. Anyone in the “developers” role has instant access to the pursuance’s gitlab instance. Anyone in “journalists” can edit the wiki. Administrative onboarding has been reduced to inviting the user to the pursuance, then adding them to the appropriate roles. When a user leaves or is removed from a pursuance, they lose access to all infrastructure, immediately, without an error-prone process of deleting their accounts from every website. Since the pursuance rules control who can add people to particular roles, we effectively have enforceable bylaws governing who can add and remove people from different institutional infrastructure. Pursuance graduates from task-management and group discovery, to automating swathes of administrative work within small groups, freeing members to spin up new organizations for their needs and focus on the work at hand.

Posted 9/9/19

Pursuance Task Management

Pursuances are built around specific goals, or “pursuances”, so it makes sense for the Pursuance platform to be task oriented. Because pursuances tackle a wide variety of topics they will do most of their work off the Pursuance platform using more appropriate tools, including wikis, source code management, shared text editors, and file hosting. Pursuance is a place to organize with other activists, and collaborate with other groups. Therefore we will place an emphasis on collaborative task management, leaving the nature of the tasks as generalizable as possible.

The Task Model

GitHub issues, while designed for tracking planned features and known bugs in software projects, work well for general task management. Lets look at how this idea interacts with the user, role, and permissions systems from previous posts.

Example List of GitHub issues

Tasks have a few attributes:

  • Title
  • Description
  • Date created
  • Status [Open/Closed, Active/Complete]
  • Assigned to: [roles and users]
  • Labels
  • Conversation thread

Tasks can be assigned to multiple roles and users, by anyone with a role granting them power to assign tasks to another role. For example, anyone in the “leader” role below has power to create tasks and assign them to anyone in “worker” role, or the “worker” role as a whole:

leader {
	assign tasks worker

worker {

By default, users can only see tasks that are assigned to them or roles they are in. This can be overridden with the cansee permission, such as:

leader {
	assign tasks worker
	cansee tasks worker

The above allows leaders to see any tasks assigned to workers. We could allow leaders to see all tasks assigned to anyone with cansee task *.

Task Labels

Tasks can be tagged with labels, defined by the pursuance:

GitHub Issue Labels

These labels have no structural or rule-based effect (so far), but are searchable, and provide an organizational level more fine-tuned than assigning to roles.

To implement this idea we need two new permissions:

Attribute Description
assign labels rolename Can set labels for any tasks assigned to a particular role
cancreate labels Can create new labels for use through the pursuance

Organizational Examples

We can build a multi-tier moderator setup, where the project leader can designate organizers, who bring in workers and assign tasks to them. Organizers label those tasks appropriately, acting as managers. This might look like:

project-leader {
	cancreate labels
	assign role organizer
	invite organizer

organizer {
	contact project-leader
	contact worker
	assign tasks worker
	assign labels worker
	invite worker

worker {
	contact organizer

(Note that we don’t have to give ‘project-leader’ all the same permissions as ‘organizer’, because anyone that’s a project-leader can simply assign the organizer role to themselves)

We can also create a simpler, more anarchic setup, where every member can create labels and assign them to any task:

member {
	cancreate labels
	assign labels *
	assign tasks *


Every task has a timeline associated with it, which describes how the task has changed, and includes discussion of the task:

Example GitHub issue timeline

Any user with a task assigned to them (or a role they are in) can comment on a task. They can also close or re-open the issue as they deem appropriate.

This timeline provides a task-based discussion space, organizing the entire pursuance akin to forum threads. This will probably be the primary means of communication for many pursuances, since most discussion will be about particular tasks. By allowing anyone assigned to the task to participate in discussion, we’ve created a private conversation space with only the relevant parties, reducing noise.


A key objective of Pursuance is facilitating collaboration between groups, rather than within groups. Tasks are the main unit for collaboration in Pursuance.

Therefore, tasks can be assigned to roles in other pursuances, so long as those pursuances accept by amending their rules. For example, say pursuances “greenpeace” and “investigating-enron” want to collaborate. Greenpeace might create a role like:

oil-investigators {
	accept tasks environmentalists@investigating-enron
	assign tasks environmentalists@investigating-enron
	assign tasks oil-investigators

While “investigating-enron” would add an equivalent role:

environmentalists {
	accept tasks oil-investigators@greenpeace
	assign tasks oil-investigators@greenpeace
	assign tasks environmentalists

Now anyone with the “environmentalist” role at the “investigating-enron” pursuance can assign tasks to themselves and the “oil-investigators” role at the “greenpeace” pursuance, and vice-versa. We can provide a graphical tool for “creating a new collaboration role” that fills out rules for a role as above, so users do not have to touch the Pursuance language unless they’re doing something especially clever.

This is more nuanced than two groups assigning work to one another. Since any user with a task assigned to them can add a message to the bottom of the task, we have effectively created a shared forum thread between two distinct communities. Users from each community can participate without previously knowing one another or explicitly adding one another to the conversation. We have an ephemeral, task-oriented, shared communication space for an arbitrary number of communities.

Further, this shared space includes only the relevant people from each community, without dragging in the rest of their infrastructure, making accounts on one another’s services, and so on.

In addition, we can create however many of these spaces are necessary through the creation of additional roles: “Here’s a task with our journalists and your journalists”, “here’s another task with our lawyers and your lawyers.” If additional expertise is needed, either pursuance can add more users or roles to the task, per their own pursuance roles.

Design Concerns

There are a few areas of the above proposal that aren’t fully thought through.

Changing Role and Pursuance Names

The above proposal works if we assume that role and pursuance names are constant. It is simplest to enforce constant names, but what happens if a pursuance is deleted? The collaboration rule no applies. What if a new pursuance is created with the same name as the old one? Does this introduce the possibility of attack through old pursuance rules?

Alternatively, we can use IDs, as in Discord. Internally, a rule would look more like accept tasks oil-investigators@<UUID FOR GREENPEACE PURSUANCE>, and when presented to a human we look up the UUID and substitute the current pursuance name. When the rule is written, a human writes the current pursuance name, and the UUID is substituted in. This allows pursuances to change their names without breaking collaboration rules, and prevents collision if a new pursuance with the same name is created after the old one is deleted.

We cannot use the same solution for roles, because the remote role may not exist yet - in the above example, the oil-investigators role is created before the corresponding environmentalists role is created, or vice-versa. Therefore, the rule no longer applies if the remote role is renamed. However, this may be fine - it allows any pursuance to change which roles are collaborating with which pursuances by changing the role names.

Task Visibility

Above we’ve described a cansee rule allowing different roles to see tasks that are not assigned to them. This makes sense in a larger organizational context, but is it unnecessarily complicated? We could allow all members of a pursuance read-only access to all tasks, and only assigned users and roles receive write-access.

This would simplify the structure of a pursuance, and perhaps make the UI simpler. However, it would also emphasize the use of multiple pursuances to a massive degree. For example, a group may have a “general worker” pursuance and a “trusted contributor” pursuance to create a separation of knowledge and trust.

In my opinion the cansee rule is beneficial, and a pursuance can manage trust and information using roles, and create multiple pursuances for separate broad objectives. This is worth having a discussion over.

How are Users Assigned

When a task is assigned to a role, access rights are clear: Anyone currently possessing the role has read and write access to the task, and if they have the role unassigned then they lose access to the task. If a user is part of multiple roles to which the task is assigned, they retain access unless they are removed from those roles, or the task is unassigned from those roles.

If we assign a task to a particular user within a role, this becomes trickier. Do we only track the user that the task was assigned to, and give them access so long as they are a member of the pursuance? Or do we track which role the user received access through, and automatically revoke their access when this role is removed?

I am a fan of the later, and envision a table for assignments akin to:

Task ID user ID role Pursuance ID

This allows us to track what users received what access via what role in what pursuance, and therefore when a user has a role removed we can unassign all applicable tasks with something like:

DELETE FROM task_assignments WHERE userID=X role=Y pursuanceID=Z;

Remaining Design Work

We now have a vision for what a pursuance looks like (a collection of users given various roles), what the main activity within a pursuance is (creating and assigning tasks, commenting on those tasks), and how pursuances collaborate (through creating shared tasks and inviting relevant parties). We’ve outlined a rule system to facilitate task, user, and role management.

This is a good structure once groups are established, but we’re missing the catalytic step: How do users discover and join pursuances? Inviting users into an existing community is effective only when you have an existing social network to build from, but one of the main objectives of the Pursuance Project is to make group discovery and participatory activism trivial.

Next up: How pursuances can present themselves to new users, and how we can visualize how pursuances are interconnected, and what skillsets are needed.

Posted 7/23/19

Pursuance Roles

Pursuance is a collaboration tool for activists, journalists, and other groups seeking positive change. At the heart of Pursuance is task management, information sharing, and communication, focusing on combining everyone’s unique talents and backgrounds, and adapting to the rapidly changing membership that plagues volunteer groups.

This post is a proposal for implementing roles and rules within Pursuance. It is compatible with, but does not require, the previous post on implementing Pursuance over email. This post is a bare minimum framework, and leaves significant room for expanding Pursuance rules as additional functionality is added to the platform.


Each user can have multiple roles within an organization, or pursuance. These roles can be used in:

  • Messaging (they work like a mailing list)
  • Task Assignment (out of scope for this post)
  • Pursuance Rules (discussed below)

This idea is inspired by Discord roles:

Screenshot of Discord Roles

On this chatroom platform, roles give users read or write access to different channels within a server, and a range of moderator powers including creating new roles and channels. Roles on Discord have an explicit hierarchy, determining which roles can assign which other roles, and what color a username appears in when the user has multiple roles.

We want to take this idea and apply it outside a chatroom, representing more flexible relationships than a simple hierarchy. Specifically we want to represent tree structures of who can contact whom, or community clusters with different expertise for reviewing documents, two use-cases discussed in an overview of Pursuance.

Pursuance Rules

A role is a title, combined with a set of permissions describing what users with the role are capable of. Therefore we need to describe Pursuance rules as we define roles.

Pursuance rules mostly describe what members do and do not have permission to do. What permissions exist?

  • Can create and edit roles
  • Can assign roles
  • Can contact another
  • Can invite users to pursuance
  • More as Pursuance develops

Initially pursuances have one user, the creator of the pursuance, with the role “founder”. This role has all permissions, thus allowing them to construct the rest of the pursuance as they see fit.

What do Pursuance Rules Look Like?

We need some syntax for describing pursuance rules and roles. Here’s a first attempt describing a document review system, where some journalists need help analyzing a trove of documents, and confer with experts when appropriate:

journalists {
	contact *
	assign role security-experts
	invite journalists
	invite security-experts

security-experts {
	contact journalists
	invite security-experts

Journalists are allowed to contact anyone, and can invite new members into the pursuance as either journalists or security-experts. They can also designate existing users as security experts.

Security experts can review certain documents on computer security topics. Therefore, they need to be able to communicate their findings back to journalists. They can also invite their peers as fellow security experts. However, they cannot invite users with any other roles, or promote existing users with different roles to security experts.

Creating Rules and Roles

Who can create roles or rules within a pursuance? Initially, only the founder, who has permission to do anything. Do we want to delegate this permission?

At first, delegation seems advantageous - we can allow moderators to refine rules for their community on behalf of administrators, or create regional community leaders who can create new roles for organizing local membership.

However, delegating this authority makes the rule system dramatically more complex. Do we add some kind of limit, like “members with the power to create roles can only give those roles subsets of the authority their own roles have?” What if the user’s permissions change? Does each role have a parent user it receives authority from? A parent role?

That’s a lot of complexity for a use-case that won’t occur often. How large do we expect a pursuance to get? Twenty users? A hundred, for some of the larger communities? How many roles and rules are necessary to administer such a group? Most pursuances will probably be satisfied with five or less roles, and rules that do not change, or rarely change, after group creation. Maybe more roles, if they’re used as simple team labels, but such roles would be boiler plate, used for task assignment and mailing lists only.

Instead, let’s keep this design as simple as possible, and enable complexity through linking pursuances together. Consider a political action group with city, state, and national levels. Instead of creating one massive pursuance with many roles and rules and complex delegation, we can create a tree of pursuances, each with their own organizational structures. Shared members between the groups act as delegates, and allow sharing of information and tasks between pursuances.

From a rule-making perspective, this means we can leave only founders with the power to create and edit roles. If a founder wants to delegate this power, they can appoint other founders.

Common Design Patterns

Expecting everyone to learn a new language to describe their organization’s social structure creates a high barrier to entry, even if the language is simple and easy to learn. Fortunately, this is largely unnecessary.

Instead of starting each pursuance with a blank slate, we can provide a list of organizational templates to choose from. This is pretty similar to what Overleaf does: LaTeX is a complicated language with a steep learning curve, so they provide a library of example LaTeX documents as starting points, dramatically simplifying the process for new users.

Not only does this make Pursuance easier to use, but it provides an opportunity to promote power structures we think are healthy or productive, exposing communities to new ideas.

Below are a handful of simple examples. As we expand the capabilities of pursuance rules, this list should be expanded.

The Chatroom

member {
	contact *
	invite member

To make this a moderated chatroom, we can add a second role:

moderator {
	kick member

The founder can now designate moderators, who have the authority to kick any member out of the pursuance.

Journalism Crowd-Sourcing

journalists {
	contact *
	assign role handlers
	invite journalists
	invite handlers
	invite sources
	kick sources
	kick handlers

handlers {
	contact journalists
	contact sources
	invite sources
	kick sources

sources {
	contact handlers

This creates a 3-stage filtering system, wherein journalists can recruit sources directly or recruit trusted helpers. Sources can present their findings to any handler, who can forward relevant information to the journalists. Handlers act as moderators, and can kick troll-sources or recruit new sources without interaction from journalists.

Additional Rule Attributes

Everything discussed about roles so far is for describing communication boundaries and recruitment of new users. What other attributes might we want to add? Here are some early ideas:

Attribute Description
group addressable Allow users to write to the entire group rather than individuals in it
public membership Make a list of users with this role public (within the pursuance? To people outside the pursuance?)
public tasks Make a list of all tasks assigned to this role
description A human-readable description of the powers and responsibilities of the role
cansee tasks X Can see tasks assigned to role X, or people with role X
cansee tasks * Can see all tasks in the pursuance
cansee files foldername Can see all files in a particular folder
canadd files foldername Can upload new files in a particular folder
canadd tasks rolename Can add new tasks and assign them to a particular role, or users with the role

Conclusion + Future Work

The role system defined above is pretty primitive, and will likely develop over time. However, this is already enough to describe how different people and groups can collaborate, how new users are added to a pursuance and assigned different roles within the organization, and how privacy is enforced.

By placing an emphasis on roles over users, we give a pursuance some flexibility as membership changes. Still missing is the ability to respond dynamically to membership changes. For example, we could add rules to a role such that when someone leaves the pursuance any tasks assigned to them are reassigned to the role at large, or to a random member within the role. This process can also occur automatically for inactive users. There’s some complexity surrounding which role to assign the task to if the user had multiple roles, but that’s for a later post on task management in Pursuance.

Also missing so far is any mention of how information is formally shared between pursuances - shared membership is sufficient for forwarding an email, and we should leverage informal systems like this whenever they are beneficial. However, it would be ideal if we could describe tasks that cross pursuances. These shared tasks would be assigned to different people in each pursuance, and facilitate task-based communication between pursuances, without explicitly merging the groups.

Posted 7/14/19

Pursuance Prototype: Email?

After my previous post I have an abstract understanding of what the Pursuance Project is trying to achieve. What would the technology itself look like? What existing technologies can we build off of to achieve our goals?

As a refresher, we need:

  • A concept of “users”

  • A concept of a group, which users can be a part of, called a “pursuance”

  • A way for users within a pursuance to message one another

  • A concept of “tasks” that can be assigned to users

  • Shared document storage

  • A “role” system for users, describing their expertise or position in the org

  • A permissions system that can be applied to pursuances, users, or roles, describing:

    • Who can contact who

    • What files can be read/written

    • What new permissions can be granted or revoked

Let’s set aside the document storage problem for a moment. The rest of Pursuance is a messaging system, with sophisticated permissions for describing who is allowed to message whom. What existing messaging platforms fit these needs?

We have a few open source messaging technologies to choose from, including IRC, XMPP/Jabber, Keybase (client is OSS, server-side is not), mastadon, and email. Rather than addressing pros and cons of each individually, what do we want out of our chat system?

We want something with an intuitive or familiar UI, and we want something that emphasizes thoughtful communication over banter. This actually rules out most chatroom software like IRC, secure texting replacements like Signal, and Twitter-like platforms like Mastadon. Keybase is attractive due to its inherent encryption, but doesn’t support much in the way of permissions controlling what users can message one another, and is notably a noisy chatroom like Discord or Slack.

What about email? Tools like spam filters control what accounts can email one another all the time, the model is trivially understood by anyone that’s used a computer, and the format is significantly longer-form than text messaging or tweets, hopefully facilitating more thoughtful communication.


Let’s say a Pursuance server is running a classic mail stack, like Postfix and Dovecot. This is a closed system, only accepting mail from Pursuance users and refusing to deliver anything externally, so we have a lot more control over configuration.

The Pursuance client can either be a desktop app or a web app with email functionality. It differs from a standard mail client in that it adds the pursuance as an extra mail header (or maybe as the domain, like @pursuance-name?), to track which pursuance two users are communicating through.

Since Postfix and Dovecot can use a database to retrieve lists of users, we can now have a few SQL tables for tracking login information, what users are in what pursuances, what roles users have in each pursuance, and what rules apply to the pursuance.

We can add a filter to Postfix that calls an external script before accepting or rejecting mail for delivery. This script can be arbitrarily complex, querying SQL, parsing pursuance rules, and ultimately choosing whether or not to deliver the message.

Additional Messaging Functionality

Want to send files between users? Email attachments are implicitly supported.

Auto-deletion of old messages? We can set up a pursuance rule that periodically triggers deletion of old emails.

End to end encryption? There are longstanding PGP standards for encrypting emails with a user-supplied keypair. This is usually tedious to set up, because every user has to install and understand tools like GPG - but if we include pre-configured encryption support in the Pursuance client, this is a non-issue. We can use the Pursuance server as a public keyserver (storing the public keys in SQL), or support using a public keyserver for redundancy.

Decentralizing server hosting? This is still a stretch goal, but email between mail servers is obviously an existing standard, and we can build from there.

Task Management

To organize a pursuance we need a concept of tasks that can be assigned to a user or group of users. With heavy inspiration from Github issues, tasks have the following attributes:

  • Task ID

  • Task Name

  • Task Description

  • Task Status (Unassigned, Assigned, Complete)

  • Assigned to Users (list)

  • Assigned to Tags (list)

All of this can be pretty easily described in an SQL table and hooked up to the existing user management database.

File Sharing

We need a large amount of storage space to store all files for all pursuances. Do we use a big hardware RAID like what’s provided by Digital Ocean? Do we use a more conventional cloud solution, like a paid Google Drive plan? The best answer from a software side is to be implementation-agnostic. We have a big folder on the Pursuance server that we can keep things in. How do we manage it?

Let’s store all files with a UUID, in a directory space like storagedirectory/pursuanceID/fileID

Each file has an entry in the database with the attributes:

  • Pursuance ID

  • File ID

  • File name

  • Parent Folder ID

We can simulate a filesystem by adding “folders” to the database with the attributes:

  • Folder ID

  • Parent Folder ID

  • Folder name

We can now apply pursuance rules to folders, creating a permissions system. We can add some kind of REST API like:

GET /directories/:pursuance: - Returns an XML structure describing all folders visible to the user, subject to pursuance rules

GET /file/:fileid: - Returns a file, if the user has permission to access it

POST /fileupload - Uploads a file with specific name to specified folder ID, if user has permission


Most of the Pursuance infrastructure can be implemented relatively easily on the server side, using SQL for tracking accounts, groups, tags, and files, and using email as an underlying messaging technology. There’s a lot to build ourselves, but it’s a lot of pretty simple database and REST API work.

There are two major challenges with this approach:

The Client

We need a pretty sophisticated client, and it’s going to be built largely from scratch. If we build a web-app then we can re-use some pre-existing components (mostly repurposing some webmail client), but that’s still a lot of JavaScript and UI work, well outside my area of expertise. However, this is going to be the case for any approach we take. Even building on top of a platform like Keybase would require making significant UI additions for the rules system and issue tracking.

The Rule System

This is the heart of Pursuance, and what makes it more valuable than “email + Asana + Google Drive”. The rule system deserves a whole design document on its own. Is it a configuration file, with rules written in XML or JSON? Is it a domain specific language? Do we make it text-based and oriented towards programmers and sysadmins? This may be easier to implement and more versatile, but will require a kind of “pursuance specialist” per pursuance to set up the rule infrastructure. Alternatively, do we give it some kind of graphical editor like Snap in an effort to make the rules easily writable for any volunteer?

Once again, the rule system will be a significant obstacle no matter what infrastructure we build Pursuance on. This seems like a feasible design from a first glance.

Posted 6/24/19

Pursuance Project Initial Impressions

I recently had a conference call with several excellent people at the Pursuance Project, a platform facilitating collaboration between users working towards shared social goals, and enabling collaboration between separate or overlapping groups working towards related goals. If that sounds vague, broad, and ambitious, it’s because it is. This is about allowing new power structures over the Internet, with unprecedented flexibility. Let’s look at a few examples to clarify.

Use Cases

The Journalist Pyramid Scheme of Information Flow

Barrett Brown’s first example crystallized the vision for me. A journalist wants to crowd-source information gathering. Unfortunately, getting tips from the public is a high-noise low-signal endeavor: Many people will submit what is already public information, or will submit conspiracy theories and nonsense. Instead, what if the journalist has a handful of trusted contacts, and they charge these contacts with gathering information and filtering the noise before forwarding the interesting tips to the journalist. These trusted contacts find a number of sources of their own, and give them the same mission - gather information, filter the noise, and report the remaining content upstream. This trivially distributes labor so the journalist can talk to a handful of contacts and receive high-quality aggregated information from a large number of sources.

We can add extra features to this system for sending some messages above a filter, to identify incompetent members of the group, or re-submitting tips to random locations in the tree to increase the chance of and speed up propagating upwards to the journalist. The basic premise of distribution of labor remains the same.

The Document Tagging Problem

Another collaborative task featuring journalists: A group has a large number of leaked or FOIA’d documents. They need to crowd-source tagging the documents or building a wiki based on the documents, to summarize the information inside and make content searchable. This is a more sophisticated problem than “filter out gibberish and obvious falsehoods from the messages sent to you”, and involves assigning tasks to individual or groups of volunteers. There may be categories of volunteers (such as specialists that understand certain kinds of technical documents), and different members may have different permissions (only some trusted individuals can delete irrelevant documents). However, the problem is fundamentally similar in that we have groups of volunteers communicating within some kind of hierarchy to prevent the chaos of an unregulated chatroom like Slack or Discord.

Pursuance Objectives

Building a unique platform for each of the above use cases would be wasteful. Each would be relatively obscure, there would be lots of duplicate code, bringing users onto a new platform for a specific project is effort-expensive, and - critically - the two projects may want to interact! What if document taggers in the second use-case are also information sources in the first use-case, feeding information about important documents they find up to a journalist? Instead, it would be better if we had a unified platform for social collaboration of this kind, so users create a single account and can interact with any number of social action groups with ease.

This means that Pursuance cannot be built for a specific type of group, but must be adaptable to many group structures. In fact, the main function differentiating Pursuance from other messaging systems is a language for describing the social framework being used. Build a system that can describe and enforce the structure of the journalist-pyramid, and the document tagging expert-clusters, and other groups will be able to adapt it for a vast number of social needs.

Technical Requirements

What are the bare-necessities for meeting the above two use-cases? We need:

  • A concept of “users”

  • A concept of a group, which users can be a part of, called a “pursuance”

  • A way for users within a pursuance to message one another

  • A concept of “tasks” that can be assigned to users

  • Shared document storage

  • A “role” system for users, describing their expertise or position in the org

  • A permissions system that can be applied to pursuances, users, or roles, describing:

    • Who can contact who

    • What files can be read/written

    • What new permissions can be granted or revoked

Some nice-to-haves:

  • End-to-end encryption for messages between users

  • Zero-Knowledge encryption of files, so the hosting server cannot read them

  • Decentralization, allowing different pursuances to host content on their own servers and link them together

Group Discovery

The above structure is sufficient for running organizations with existing users. However, a large problem in activist and non-profit spaces is peer-discovery and avoiding duplication of effort. Pursuance should also provide an easy way to discover other organizations, perhaps by searching for their titles, descriptions, or viewing shared membership. Imagine something as follows:

Diagram of pursuance discovery

Maybe the circle size is based on the number of participating members, and the color indicates the number of messages sent / number of members in the past 30 days, as a vague indicator of activity. Edges indicate shared membership, pulling collaborating pursuances close on the map. Selecting a pursuance, like Signal, displays an additional description of the group’s purpose.

We need to add the following attributes to a pursuance to achieve this:

  • A pursuance title

  • A pursuance description

  • Some pursuance-level permissions for what information can be shared publicly:

    • Number of members

    • Identity of members?

    • Activity level

    • Messages

    • Files

Concluding Thoughts

This is a complicated project. One of the most difficult and important tasks for Pursuance will be making this technology intuitive, and hiding the complexity as much as possible when it is not critical for users to understand. From the perspective of the journalist in the first use-case, we probably want the journalist to see and send messages to their trusted contacts, and that’s all. Let the trusted contacts manage the complexity of the pyramid structure. Perhaps it makes sense for each group to have a “pursuance manager”, much like a sysadmin, who is more well-versed in the technology and manages the rules that make the pursuance tick.

Posted 6/21/19

Group-Grid Theory for Classifying Social Groups

I’ve recently been introduced to Group-Grid Theory, a framework from anthropology for classifying group dynamics and power structures. Let’s examine the model from an interest in intentional community building.

Under Group-Grid theory, communities are described along two axes, predictably “group” and “grid”. Here, “group” means how cohesive the community is, in terms of both clear delineation of membership (it’s obvious who’s a part of the community), and in how group-centric the thinking and policies within the group are. Slightly more complex is “grid”, which represents how structured the group is in terms of both leadership hierarchy and sophistication of / emphasis on rules.

Group/Grid Low Grid High Grid
Low Group Individualism Fatalism
High Group Enclavism Hierarchy

The above four groups are the most extreme corners of the axes - of course any real group will contain attributes along both axes, and land in a gradient rather than discrete categories.

The Four Archetypes


This is the organizational structure we’re most used to, for organizations like corporations, the military, and student clubs. Membership is explicitly defined by initiation rites including contracts, swearing-in ceremonies, paying dues, and attending meetings.

The organizations not only have well-defined rules, but formal leadership hierarchies like officer positions, defined in bylaws or community guidelines.

When problems occur in these communities, they fall back on rules to assign responsibility or blame, and determine what courses of action to take.


Enclaves are groups without complex, well-defined structure, leadership, or rules, but clearly-defined membership qualities. Examples include communes, families, and other “horizontal” organizations.

These organizations are not without power dynamics, and frequently assign implicit authority based on experience or age. Membership is based on physical proximity (often living together), shared contributions of labor, or shared genetics.

In these organizations, problems are often framed as something external threatening the in-group. Conflict resolution revolves around the in-group collaborating to either deal with the external force, or in extreme circumstances, growing or shrinking the in-group to maintain cohesion.


Individualist organizations, as the name implies, have neither strong respect for authority nor clear group boundaries. These can include loose social “scenes” like hactivism or security culture, social movements like Black Lives Matter, or loosely organized hate groups. There are shared attributes in the organization, such as an ethos or area of interest - otherwise there would be no social group at all - but there is minimal structure beyond this.

Membership in these groups is usually permeable and self-defined: What makes someone a part of Anonymous beyond declaring that they are? What makes them no longer a member of that community, except ceasing to speak in those circles and dropping the Anonymous title? As members join and leave with ease, tracking the size and makeup of these groups is extremely challenging.

When these groups face pressure they fragment easily, making multiple overlapping communities to encompass differences in opinion. This fragmentation can be due to disagreements over ideology, hatred or reverence of a particularly person, group, or action, or similar schisms within the in-group. This apparent lack of consistency can in some ways serve as stability, allowing groups to adapt to change by redefining themselves with ease.


Fatalism describes organizations with sophisticated rules and rituals, but no communal behavior or allegiance. One example is capitalism as an ecosystem: There are rules of behavior governing money-making activities, but there is no care given to other participants in the community. In ultra-capitalist models, corporations are cut-throat to both one another and their own employees, prioritizing money-making over community health. Other fatalist groups include refugees, governed by the system of rules in their host country, without being cared-for members of it in the same way as a citizen.

These groups are called fatalist, because there are no tools for addressing conflict: The leadership structure hands down decisions and their effects, and there is little recourse for those impacted. The community holds little power, and has little trust in the benevolence of the grid.

Early Thoughts

The Group/Grid lens illustrates trade-offs between making groups with formal rules and leadership systems, and building a more anarchic self-organized group. It also shows benefits of declaring formal membership criteria and focusing on community-building, or allowing permeable, self-defined membership. Early intuitions are that a focus on community builds a more committed membership, which will be less prone to fragmentation and dissolution. Unfortunately, strong group identity can also breed toxic group dynamics, as members are more invested in seeing their vision realized and more resistant to “walking away” when the group moves in an incompatible direction. Similarly, group hierarchy can be efficient for decision-making, but can alienate the community if applied bluntly. Hierarchy works great at a local level, as with school clubs, where it’s effectively just division of labor. If the grid is no longer operated by the community, then we inevitably reach fatalism, which has extreme drawbacks.

These are sophomoric first impressions, but now I have group-grid as a tool for describing and analyzing groups, and can apply it moving forwards. I’ll probably return to this topic in future posts as it becomes relevant.

Posted 6/13/19

Steganography and Steganalysis with Fourier Transforms

This post is a high-level introduction to hiding messages in images using Fourier Transforms on the color data. This technique is less susceptible to accidental destruction than techniques like Least Significant Bit steganography, while remaining far more challenging to detect than metadata-based approaches like storing secret messages in image comments. No background in steganography or Fourier Transforms is expected. This post is largely based on “Image Steganography and Steganalysis”, by Mayra Bachrach and Frank Shih.

Image Steganography, the Basics

Our objective is to hide secret messages, whether they be text or arbitrary files, inside images. Images are an attractive secret-message envelope, since they can be transferred in a number of ways (texting, emails, posting on forums, sharing through Google Photos, etc), and do not raise suspicions in many contexts. Before discussing Fourier Transform steganography, we’ll talk about some simpler approaches as context.

Most images include metadata for storing various statistics about the image contents. This metadata can be viewed and edited with tools like exiftool:

% exiftool qr.png 
ExifTool Version Number         : 11.01
File Name                       : qr.png
Directory                       : .
File Size                       : 9.7 kB
File Modification Date/Time     : 2019:04:27 20:10:55-04:00
File Access Date/Time           : 2019:04:27 20:10:57-04:00
File Inode Change Date/Time     : 2019:04:27 20:10:56-04:00
File Permissions                : rw-rw-rw-
File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 246
Image Height                    : 246
Bit Depth                       : 8
Color Type                      : RGB with Alpha
Compression                     : Deflate/Inflate
Filter                          : Adaptive
Interlace                       : Noninterlaced
Image Size                      : 246x246
Megapixels                      : 0.061

A first trivial attempt at message hiding is simply putting your secret message in one of these metadata fields. Unfortunately, this is easy to detect with automated tools, as most images won’t have many human-readable strings in them. This data may also be accidentally deleted, as many web services strip image data intentionally, or unintentionally lose it when translating from one image format (like PNG) to another (like JPG).

A slightly more sophisticated solution is Least Significant Bit steganography. The synopsis is:

  • Every pixel’s color is represented as three bytes, for Red, Green, and Blue

  • A change to the least significant bit will result in a nearly-identical color, and the difference will not be perceptible to the human eye

  • We can represent our secret message as a binary sequence

  • We can set the least significant bit of each pixel to the bits from our secret message

Done! And no longer trivially detectable! Even if someone does find your message, it will be hard to prove it is a secret message if it’s encrypted. Unfortunately this method is also susceptible to accidental breaks: If an image is resized or translated to another format then it will be recompressed, and these least significant bits are likely to be damaged in the process.

We want an equally secret message encoding system that is as difficult to detect, but less susceptible to damage.

Fourier Transforms

The math behind Fourier Transforms is complicated, but the intuition is not. Consider a sine wave:

Sine Wave Example

This wave can be described as a frequency and amplitude - let those be x- and y-coordinates in 2-D space:

Sine Wave Frequency Map

We can add a second wave, with a different frequency:

2 Sine Wave Example

And when we combine the two signals we can represent the combination as two points in frequency-amplitude space, representing the two waves we’ve added:

Sine Waves Combines

(The code used to generate the above images can be found here)

This leads to three conclusions:

  • Any arbitrarily complicated signal can be represented as a series of sine waves, layered on top of one another

  • A finite-length signal can be represented with a finite number of sine waves

  • The original signal can be reconstructed by taking the constituent sine waves and combining them

The Discrete Fourier Transform is an algorithm that derives these waves, given a signal of finite length described as discrete samples.

Why would we want to represent waves in this way? A plethora of reasons. Noise reduction, by deleting all waves with an amplitude below a specific threshold. Noise isolation, by deleting all waves not within a specific frequency range (such as the human vocal range). Noise correction, by shifting the frequency of some waves (as used in auto-tune). Ultimately, the Fourier Transform is the foundation of much audio, video, and image compression and manipulation.

Converting Images to a Fourier Matrix

The paper is a little hand-wavey on this part, but images can be expressed as a layering of sine and cosine waves with input based on the pixel coordinates. As with the one-dimensional Fourier transforms used in audio analysis, this process can be reversed to reproduce the original image. The number of samples and waves used determines the accuracy of the approximation, and thus the accuracy when inverting the function to recreate the original image. This video goes into further detail on the process, which is commonly used for image editing and compression.

Embedding and Extracting Messages using Fourier Series

Next, the user expresses their embedded message as a Fourier series. This can be done in a variety of ways, from adapting the waveform of an audio message, to encoding text as a bitsequence and solving the Fourier series for that sequence, to simply Fourier encoding a second image. Once the user has a message encoded as a Fourier series they can easily superimpose the signal by adding coefficients to the corresponding polynomials in the image matrix. The matrix can then be reversed, translating from the frequency domain back to the spatial image domain. The effect is a slight dithering, or static, applied to the image. By shifting the frequency of the hidden message up or down the user may adjust the static properties until a subtle effect is achieved.

The steganographic data can be relatively easily extracted given a copy of the original image. Comparing the pixels of the original and modified image can demonstrate that something has been changed, but not a discernible pattern that can be distinguished from artifacting resulting from lossy image compression, such as what one would see by switching the data format from PNG to JPEG. However, by converting both images to their Fourier matrix representation and subtracting from each other, anyone can retrieve the polynomial representing the encoded message. If the message was frequency adjusted to minimize visual presence, it must now be frequency shifted back, before decoding from Fourier to the original format (audio, bitsequence, etc).

If the unaltered image is not available, because the photo is an original rather than something taken from the web, then a simple delta is impossible. Instead, statistical analysis is necessary. Once again, the Fourier transform is critical, as it allows for pattern recognition and signal detection, differentiating between normal image frequencies and the structured data resulting from layering a message on top of the image.

Steganalysis with Fourier Transforms

The same Fourier-delta technique can be used for the more difficult task of detecting and extracting steganography of an unknown format. In this case, we are given an image, and need to establish both whether there is a hidden message, and preferably, what it is. Given an arbitrary image, we first need to establish a baseline. We can perform a reverse image search and find similar images, with an identical appearance but different hashes. We then compare each baseline image to the possibly steganographic image by converting both to Fourier matrices and calculating a delta, as above. We must then perform noise reduction to remove minor perturbations such as image re-encoding and re-sizing artifacting. If the remaining delta is statistically significant, then there is evidence of a secret signal. This completes the first step, identifying the presence of a steganographic message.

Unfortunately, interpreting this identified message is beyond the scope of the paper. Both participants in a secret conversation can pre-agree on an encoding scheme such as audio, bitstrings, or an embedded image. Given only a frequency spectrum, an analyst needs to attempt multiple encodings until something meaningful is produced. Particularly if the frequency-shifting outlined above has been performed, this is an extremely tedious process, better suited (at least so far) to manual inspection and intuitive analysis than a purely automated design.

Posted 5/8/19

Infection and Quarantine

Network Science is a relatively young discipline, which uses a small number of basic models to represent a wide variety of network phenomenon, ranging from human social interactions, to food webs, to the structure of the power grid.

This post focuses on social communities, historically modeled as random networks, where every person has a percent change of having a social connection to any other person. The following is an example of a random network, where the circular nodes represent people, and the edges between circles indicate a social link:

Example of a Random Network

The Topic

Of particular interest to me are information cascades, where some “state” is passed along from node to node, rippling through the network. These “states” can represent the spread of disease, a meme, a political ideology, or any number of things that can be passed on through human interaction.

A starting point for understanding information cascades is disease propagation. Traditionally, infection models run in discrete timesteps, and assume that the infection has a percent chance of spreading across an edge each turn, and will continue to spread until the infected are no longer contagious. This is an adequate model for large-scale disease propagation, where each disease has a different contagious period and infection rate. This basic model is less accurate when applied to small communities, where the differences between individuals cannot be statistically ignored, or when applied to the spread of information rather than disease.


Two research papers by Santa Fe Institute researchers extend the infection model to better reflect human behavior and repurpose it to examine the spread of ideas. Duncan Watts’ “A Simple Model of Global Cascades on Random Networks” proposes that each individual has a different susceptibility to infection, which they call an “Activation Threshold”, represented as a percentage of a node’s peers that must be infected before the node will be infected.

To understand this idea, consider a social platform like Twitter or Google+ in its infancy. Some people are early adopters - as soon as they hear a friend is on the platform they will join the platform to participate. Other people are more apprehensive, but once a majority of their peers have Facebook accounts then they, too, will make Facebook accounts so they do not miss out on significant social community. Early adopters have a low activation threshold, apprehensive people have a high threshold.

The result is that a cascade can only succeed in overwhelming a community if it reaches a “critical mass” where adoption is high enough to pass every member’s activation threshold:

Example of total network infection

If the activation threshold is too high, or the cascade starts in a poor location where many peers have high thresholds, then the cascade will be unable to reach a critical mass, then the infection will fizzle out. This may explain why some ideas, like Twitter or Facebook, succeed, while others like Google+ or Voat, do not. A failed cascade can convert a small cluster of users, but is ultimately contained by high threshold peers:

Example of failed network infection

Akbarpour and Jackson propose a radically different, if simple, extension in “Diffusion in Networks and the Virtue of Burstiness”. They argue that it is insufficient for two people to have a social link for an idea to spread - the users must meet in the same place at the same time. People fall in one of three patterns:

  1. Active for long periods, inactive for long periods. Think of office workers that are only at work from 9 to 5. If two offices are geographically separated, they can only share information while their work days overlap.

  2. Active then inactive in quick succession. Perhaps someone that takes frequent breaks from work to have side conversations with their neighbors.

  3. Random activity patterns. Likely better for representing online communities where someone can check in from their phone at any time.

My Work

While random networks are simple, human communities are rarely random. Since their introduction in 1999, human communities have more frequently been modeled as scale-free networks. In a scale-free network, new members of the community prefer to connect to well-connected or “popular” members. This results in a small number of highly connected, very social individuals referred to as “hubs”, each with many peers with far fewer connections. Here is an example scale-free network demonstrating the hub-clustering phenomenon:

Example of a Scale-Free Network

My objective is to combine the above two papers, and apply the results to scale-free networks. In this configuration, the activity pattern and activation threshold of the hubs is of paramount importance, since the hubs act as gatekeepers between different sections of the network.

When a hub is infected, the cascade will spread rapidly. The hub is connected to a large number of peers, and makes up a significant percentage of the neighbors for each of those peers, since most nodes in the network only have a handful of connections. This means an infected hub can overcome even a relatively high activation threshold.

Scale Free with infected hub

Compromising even a single hub will allow the cascade to reach new branches of the network and dramatically furthers infection:

Scale Free with infected hub

However, infecting a hub is challenging, because hubs have so many peers that even a low activation threshold can prove to be an obstacle. Without capturing a hub, the spread of a cascade is severely hindered:

Scale Free with no infected hub

Even with highly susceptible peers, well-protected hubs can isolate a cascade to a district with ease:

Scale Free with no infected hub

Next Steps

So far, I have implemented Watts’ idea of “activation thresholds” on scale-free networks, and have implemented-but-not-thoroughly-explored the activity models from the Burstiness paper. The next step is to examine the interplay between activity types and activation thresholds.

These preliminary results suggest that highly centralized, regimented networks are better able to suppress cascades, but when a cascade does spread, it will be rapid, dramatic, and destabilizing. A fully decentralized, random network has far more frequent and chaotic cascades. Is there a mid-level topology, where small cascades are allowed to occur frequently, but the thread of a catastrophic cascade is minimized? I would like to investigate this further.

Posted 10/23/18

Patching Registration Checks, An Introduction

Taking a break from recent social architecture posts, here’s some more technical security content. It’s a pretty soft introduction to reverse engineering and binary patching for those unfamiliar with the topic.


One of the tools I use in my social research recently became abandonware. The product requires annual registration, but earlier this year the developer’s website disappeared, no one has been able to reach them by email for months, and I literally cannot give them my money to reactivate the product. Unfortunately, I am still reliant on the tool for my work. With no alternative, let’s see if the software can be reactivated through more technical means.


The tool in question is a graph layout plugin for Cytoscape, so it’s distributed as a JAR file. JAR files are bundled executables in Java - they’re literally ZIP archives with a few specific files inside.

The goal is to find where the software checks whether it’s registered, and patch the binary so it always returns true.

Since this is Java, we’ll start with a decompiler. The goal of a decompiler is to go from compiled bytecode back to human-readable sourcecode in a language like Java or C. On platforms like x86 this is often very messy and unreliable, both because x86 is disgustingly complicated, and because there are many valid C programs that could compile to the same assembly. Fortunately, today we’re working with JVM bytecode, which is relatively clean and well-understood, so the JVM to Java decompilers are surprisingly capable.

I’ll be using JD-Gui, the “Java Decompiler”. Sounds like the tool for the job. Open up JD-GUI, tell it to open the JAR file, and we’re presented with a screen like this:

JD-GUI opening a JAR

Wandering through the .class files in com (where most of the code resides), we eventually find reference to public static boolean isActivated(), which sure sounds promising. Here’s the method definition:

JD-Gui isActivated()

If either the product is activated, or it’s still in a trial period, the function returns true. This appears to be our golden ticket. Let’s change this method so that it always returns true.


There are two techniques for patching Java code. The apparently simple option would be to use a decompiler to get Java code out of this JAR, change the Java code, then recompile it back in to a JAR. However, this assumes the decompiler was 100% perfectly accurate, and the JAR you get at the end is going to look rather different than the one you started with. Think throwing a sentence in to Google Translate and going from English to Japanese back to English again.

The cleaner technique is to leave the JAR intact, and patch this one method while it is still JVM bytecode. The decompiler helped find the target, but we’ll patch at the assembly level.

First, we’ll need to extract contents of the JAR. Most JVM bytecode editors work at the level of .class files, and won’t auto-extract and repack the JAR for us.

$ mkdir foo
$ cp foo.jar foo
$ cd foo
$ jar -xf foo.jar

Now all the individual files visible in the JD-Gui sidebar are unpacked on the filesystem to edit at our leisure. Let’s grab the Java Bytecode Editor, open the DualLicenseManager.class file, and scroll to the isActivated() method:

JBE Examining Bytecode

Above is the assembly for the short Java if-statement we saw in JD-GUI. If isProductActivated() returns true, jump to line 14 and push a 1 on the stack before jumping to line 19. Else, if isTrialActivated() returns false, jump to line 18 and push a 0 on the stack. Then, return the top item on the stack.

Uglier than the Java, but not hard to follow. The patch is trivially simple - change the iconst_0 to an iconst_1, so that no matter what, the method always pushes and returns a 1.

JBE Patching Bytecode

Then we save the method, and it’s time to re-pack the JAR.


Re-creating the JAR is only slightly more complicated than unpacking:

$ jar -cvf foo.jar *
$ jar -uvfm foo.jar META-INF/MANIFEST.MF

For some reason creating a JAR from the contents of a folder ignores the manifest and creates a new one. Since we specifically want to include the contents of the manifest file (which include some metadata necessary for the plugin to connect with Cytoscape), we explicitly update the manifest of the JAR with the file.


From here, we can reinstall the plugin in Cytoscape and it runs like a charm.

Often, this kind of binary patching is less straightforward. This work would have been much more time-consuming (though not much more challenging) if the executable had not included symbols, meaning none of the method names are known. Most C and C++ executables do not include symbols, so you are forced to learn what each function does by reading the assembly or looking at the context in which the function is called. This is done for performance more than security, since including symbols makes the executable larger, and is only helpful for the developers during debugging.

More security-minded engineers will use tools like Packers to obfuscate how the code works and make it more difficult to find the relevant method. These are not insurmountable, but usually require watching the packer decompress the main program in memory, waiting for the perfect moment, then plucking it from memory to create an unpacked executable.

Another option is including some kind of checksumming so that the program can detect that it has been tampered with, and refuses to run, or modifies its behavior. This is rarely helpful however, since the reverse engineer can simply look for the appropriate hasBeenTamperedWith() function and patch it to always return false. An insidious programmer could try to hide this kind of code, but it’s a cat-and-mouse game with the reverse engineer that will only buy them time.

Ultimately, this tool was built by scientists, not battle-hardened security engineers, and included no such counter-measures. Should they resurface I will gladly continue paying for their very useful software.

Posted 7/31/18

Democratic Censorship

There’s been a lot of discourse recently about the responsibility social media giants like Facebook and Twitter have to police their communities. Facebook incites violence by allowing false-rumors to circulate. Twitter only recently banned large communities of neo-nazis and white supremacists organizing on the site. Discord continues to be an organizational hub for Nazis and the Alt-Right. There’s been plenty of discussion about why these platforms have so little moderatorship, ranging from their business model (incendiary content drives views and is beneficial for Facebook), to a lack of resources, to a lack of incentive.

I’d like to explore a new side of the issue: Why should a private company have the role of a cultural censor, and how can we redesign our social media to democratize censorship?

To be absolutely clear, censorship serves an important role in social media in stopping verbal and emotional abuse, stalking, toxic content, and hate speech. It can also harm at-risk communities when applied too broadly, as seen in recent well-intentioned U.S. legislation endangering sex workers.

On Freedom of Speech

Censorship within the context of social media is not incompatible with free-speech. First, Freedom of Speech in the United States is largely regarded to apply to government criticism, political speech, and advocacy of unpopular ideas. These do not traditionally include speech inciting immediate violence, obscenity, or inherently illegal content like child pornography. Since stalking, abuse, and hate-speech do not contribute to a public social or political discourse, they fall squarely outside the domain of the USA’s first amendment.

Second, it’s important to note that censorship in social media means a post is deleted or an account is locked. Being banned from a platform is more akin to exile than to arrest, and leaves the opportunity to form a new community accepting of whatever content was banned.

Finally there’s the argument that freedom of speech applies only to the government and public spaces, and is inapplicable to a privately-owned online space like Twitter or Facebook. I think had the U.S. Bill of Rights been written after the genesis of the Internet this would be a non-issue, and we would have a definition for a public commons online. Regardless, I want to talk about what should be, rather than what is legally excusable.

The Trouble with Corporate Censors

Corporations have public perceptions which effect their valuations. Therefore, any censorship by the company beyond what is legally required will be predominantly focused on protecting the ‘image’ of the company and avoiding controversy so they are not branded as a safe-haven for bigots, nazis, or criminals.

Consider Apple’s censorship of the iOS App Store - repeatedly banning drone-strike maps with minimal explanatory feedback. I don’t think Apple made the wrong decision here; they reasonably didn’t want to be at the epicenter of a patriotism/pro-military/anti-war-movement debate, since it has nothing to do with their corporation or public values. However, I do think that it’s unacceptable that Apple is in the position of having this censorship choice to begin with. A private corporation, once they have sold me a phone, should not have say over what I can and cannot use that phone to do. Cue Free Software Foundation and Electronic Frontier Foundation essays on the rights of the user.

The same argument applies to social media. Facebook and Twitter have a vested interest in limiting conversations that reflect poorly on them, but do not otherwise need to engender a healthy community dynamic.

Sites like Reddit that are community-moderated have an advantage here: Their communities are self-policing, both via the main userbase downvoting inappropriate messages until they are hidden, and via appointed moderators directly removing unacceptable posts. This works well in large subreddits, but since moderators have authority only within their own sub-communities there are still entire subreddits accepting of or dedicated to unacceptable content, and there are no moderators to review private messages or ban users site wide. A scalable solution will require stronger public powers.

Federated Communities

The privacy, anonymity, and counter-cultural communities have been advocating “federated” services like Mastadon as an alternative to centralized systems like Twitter and Facebook. The premise is simple: Anyone can run their own miniature social network, and the networks can be linked at will to create a larger community.

Privacy Researcher Sarah Jamie Lewis has written about the limitations of federated models before, but it boils down to “You aren’t creating a decentralized democratic system, you’re creating several linked centralized systems, and concentrating power in the hands of a few.” With regards to censorship this means moving from corporate censors to a handful of individual censors. Perhaps an improvement, but not a great one. While in theory users could react to censorship by creating a new Mastadon instance and flocking to it, in reality users are concentrated around a handful of large servers where the community is most vibrant.

Components of a Solution

A truly self-regulatory social community should place control over censorship of content in the hands of the public, exclusively. When this leads to a Tyranny of the Majority (as I have no doubt it would), then the effected minorities have an incentive to build a new instance of the social network where they can speak openly. This is not an ideal solution, but is at least a significant improvement over current power dynamics.

Community censorship may take the form of voting, as in Reddit’s “Upvotes” and “Downvotes”. It may involve a majority-consensus to expel a user from the community. It may look like a more sophisticated republic, where representatives are elected to create a temporary “censorship board” that removes toxic users after quick deliberation. The key is to involve the participants of the community in every stage of decision making, so that they shape their own community standards instead of having them delivered by a corporate benefactor.

Care needs to be taken to prevent bots from distorting these systems of governance, and giving a handful of users de-facto censorship authority. Fortunately, this is a technical problem that’s been explored for a long time, and can be stifled by deploying anti-bot measures like CAPTCHAs, or by instituting some system like “voting for representatives on a blockchain”, where creating an army of bot-votes would become prohibitively expensive.

This should be not only compatible, but desirable, for social media companies. Allowing the community to self-rule shifts the responsibility for content control away from the platform provider, and means they no longer need to hire enormous translator and moderator teams to maintain community standards.

Posted 4/22/18

Hacker Community Espionage

I recently got to see a talk at the Chaos Communication Congress titled “When the Dutch secret service knocks on your door”, with the following description:

This is a story of when the Dutch secret service knocked on my door just after OHM2013, what some of the events that lead up to this, our guesses on why they did this and how to create an environment where we can talk about these things instead of keeping silent.

Since the talk was not recorded, the following is my synopsis and thoughts. This post was written about a week after the talk, so some facts may be distorted by poor memory recall.

  • The speaker was approached by members of the Dutch secret service at his parents’ house. They initially identified themselves as members of the department of the interior, but when asked whether they were part of the secret service, they capitulated.

  • The agents began by offering all-expenses-paid travel to any hackathon or hackerspace. All the speaker needed to do was write a report about their experience and send it back. A relatively harmless act, but it means they would be an unannounced informant in hacker communities.

  • When the author refused, the agents switched to harder recruitment techniques. They pursued the author at the gym, sat nearby in cafes when the author held meetings for nonprofits, and likely deployed an IMSI catcher to track them at a conference.

  • Eventually, the author got in contact with other members of the hacker community that had also been approached. Some of them went further through the recruitment process. The offers grew, including “attend our secret hacker summer camp, we’ll let you play with toys you’ve never heard of,” and “If you want to hack anything we can make sure the police never find out.” In either of these cases the recruit is further indebted to the secret service, either by signing NDAs or similar legal commitments to protect government secrets, or by direct threat, wherein the government can restore the recruit’s disappeared criminal charges at any time.

I have two chief concerns about this. First, given how blatant the secret service was in their recruitment attempts, and that we only heard about their attempts in December of 2017, we can safely assume many people accepted the government’s offer. Therefore, there are likely many informants working for the secret service already.

Second, this talk was about the Netherlands - a relatively small country not known for their excessive surveillance regimes like the Five Eyes. If the Netherlands has a large group of informants spying on hackerspaces and conferences around the globe, then many other countries will as well, not to mention more extreme measures likely taken by countries with more resources.

From this, we can conclude there are likely informants in every talk at significant conferences. Every hackerspace with more than token attendance is monitored. This is not unprecedented - the FBI had a vast array of informants during the COINTELPRO era that infiltrated leftist movements throughout the United States (along with much less savory groups like the KKK), and since shortly after 9/11 has used a large group of Muslim informants to search for would-be terrorists.

Posted 1/7/18

Alcoholics Anonymous as Decentralized Architecture

Most examples of decentralized organization are contemporary: Black Lives Matter, Antifa, the Alt-Right, and other movements developed largely on social media. Older examples of social decentralization tend to be failures: Collapsed Hippie communes of the 60s, anarchist and communist movements that quickly collapsed or devolved to authoritarianism, the “self-balancing free market,” and so on.

But not all leaderless movements are short-lived failures. One excellent example is Alcoholics Anonymous: An 82-year-old mutual aid institution dedicated to helping alcoholics stay sober. Aside from their age, AA is a good subject for study because they’ve engaged in a great deal of self-analysis, and have very explicitly documented their organizational practices.

Let’s examine AA’s Twelve Traditions and see what can be generalized to other organizations. The twelve traditions are reproduced below:

  1. Our common welfare should come first; personal recovery depends on AA unity.

  2. For our group purpose there is but one ultimate authority - a loving God as He may express Himself in our group conscience.

  3. The only requirement for AA membership is a desire to stop drinking.

  4. Each group should be autonomous except in matters affecting other groups or AA as a whole.

  5. Each group has but one primary purpose - to carry its message to the alcoholic who still suffers.

  6. An AA group ought never endorse, finance or lend the AA name to any related facility or outside enterprise, lest problems of money, property and prestige divert us from our primary purpose.

  7. Every AA group ought to be fully self-supporting, declining outside contributions.

  8. Alcoholics Anonymous should remain forever nonprofessional, but our service centers may employ special workers.

  9. AA, as such, ought never be organized; but we may create service boards or committees directly responsible to those they serve

  10. Alcoholics Anonymous has no opinion on outside issues; hence the AA name ought never be drawn into public controversy.

  11. Our public relations policy is based on attraction rather than promotion; we need always maintain personal anonymity at the level of press, radio and films.

  12. Anonymity is the spritual foundation of all our traditions, ever reminding us to place principles before personalitites.

The above twelve rules can be distilled to three themes:

  • The group comes first

  • The group is single-issue

  • The group should be independent of any external or internal structures

The first theme stresses anonymity in an interesting way: Not to protect individual members (many of whom want to be anonymous when in an organization like AA), but to prevent the rise of “rock-stars”, or powerful individuals with celebrity status. Personal power is prone to abuse, both at an inter-personal level (see the plethora of sexual abuse cases in the news right now), and at a structural level, where the organization becomes dependent on this single individual, and is drawn in to any conflict surrounding the celebrity.

The solution to a rock-star is to kick them out of the organization, and maintain a healthier community without them. AA has gone a step further however, and outlines how to prevent the rise of a rock-star by preventing any personal identification when communicating to the outside world. When you are speaking to the press you are Alcoholics Anonymous, and may not use your name. For further discussion on rock-stars in tech communities, see this article.

The single-issue design is an unusual choice. Many social movements like the Black Panthers stress solidarity, the idea that we should unite many movements to increase participants and pool resources. This is the same principle behind a general strike, and broad, cross-issue activist networks like the Indivisible movement. However, focusing on a single issue continues the trend of resisting corruption and abuse of power. AA keeps a very strict, simple mission, with no deviations.

The last theme, total organizational independence, is also unusual. Organizations that fear external attack, like terrorist cells, may operate in isolation from other cells with little to no higher-level coordination. Organizations avoiding internal corruption, like the Occupy movement, or fraternities, may limit internal leadership and centralization of power using systems like Robert’s Rules of Order or Clusters & Spokes Councils, or they may organize more anarchically, through organic discussion on social media. Avoiding both internal and external hierarchy, however, sacrifices both large-scale coordination and quick decision making. This works for Alcoholics Anonymous, because their mission is predefined and doesn’t require a great deal of complex leadership and decision making. It is also used by Antifa, where local groups have no contact with one another and rely on collective sentiment to decide on actions.

Overall, AA is an interesting introduction to decentralized organizations. I will revisit these ideas as I learn more.

Posted 1/6/18

Halftone QR Codes

I recently encountered a very neat encoding technique for embedding images into Quick Response Codes, like so:

Halftone QR Code Example

A full research paper on the topic can be found here, but the core of the algorithm is actually very simple:

  1. Generate the QR code with the data you want

  2. Dither the image you want to embed, creating a black and white approximation at the appropriate size

  3. Triple the size of the QR code, such that each QR block is now represented by a grid of 9 pixels

  4. Set the 9 pixels to values from the dithered image

  5. Set the middle of the 9 pixels to whatever the color of the QR block was supposed to be

  6. Redraw the required control blocks on top in full detail, to make sure scanners identify the presence of the code

That’s it! Setting the middle pixel of each cluster of 9 generally lets QR readers get the correct value for the block, and gives you 8 pixels to represent an image with. Occasionally a block will be misread, but the QR standard includes lots of redundant checksumming blocks to repair damage automatically, so the correct data will almost always be recoverable.

There is a reference implementation in JavaScript of the algorithm I’ve described. I have extended that code so that when a pixel on the original image is transparent the corresponding pixel of the final image is filled in with QR block data instead of dither data. The result is that the original QR code “bleeds in” to any space unused by the image, so you get this:

Halftone QR with background bleed

Instead of this:

Halftone QR without background bleed

This both makes the code scan more reliably and makes it more visually apparent to a casual observer that they are looking at a QR code.

The original researchers take this approach several steps further, and repeatedly perturb the dithered image to get a result that both looks better and scans more reliably. They also create an “importance matrix” to help determine which features of the image are most critical and should be prioritized in the QR rendering. Their code can be found here, but be warned that it’s a mess of C++ with Boost written for Microsoft’s Visual Studio on Windows, and I haven’t gotten it running. While their enhancements yield a marked improvement in image quality, I wish to forgo the tremendous complexity increase necessarily to implement them.

Posted 12/19/17

Cooperative Censorship

I have long been an opponent of censorship by any authority. Suppression of ideas stifles discussion, and supports corruption, authoritarianism, and antiquated, bigoted ideas. I have put a lot of thought in to distributed systems, like Tor or FreeNet, that circumvent censorship, or make it possible to host content that cannot be censored.

However, the recent Charlottesville protests show another side of the issue. Giving the alt-right a prolific voice online and in our media has allowed the Nazi ideology to flourish. This isn’t about spreading well-reasoned ideas or holding educational discussion - the goal of white supremacists is to share a message of racial superiority and discrimination based wholly in old hateful prejudice, not science or intellectual debate.

The progress of different hosting providers shutting down the Daily Stormer neo-Nazi community site shows how hesitant Corporate America is to censor - whether out of concern for bad PR, loss of revenue, perception of being responsible for the content they facilitate distribution of, or (less likely) an ideological opposition to censorship.

Ultimately, I still belief in the superiority of decentralized systems. Money-driven corporations like GoDaddy and Cloudflare should not be in the position where they are cultural gatekeepers that decide what content is acceptable and what is not. At the same time, a distributed system that prevents censorship entirely may provide an unreasonably accessible platform for hate speech. No censorship is preferable to authoritarian censorship, but is there a way to build distributed community censorship, where widespread rejection of content like white supremacy can stop its spread, without allowing easy abuse of power? If it is not designed carefully such a system would be prone to Tyranny of the Majority, where any minority groups or interests can be oppressed by the majority. Worse yet, a poorly designed system may allow a large number of bots to “sway the majority”, effectively returning to an oligarchic “tyranny of the minority with power” model. But before ruling the concept out, let’s explore the possibility some…

Existing “Distributed Censorship” Models

Decentralized Twitter clone Mastadon takes a multiple-instances approach to censorship. Effectively, each Mastadon server is linked together, or “federated”, but can refuse to federate with particular servers if the server admin chooses to. Each server then has its own content guidelines - maybe one server allows pornography, while another server forbids pornography and will not distribute posts from servers that do. This allows for evasion of censorship and the creation of communities around any subject, but content from those communities will not spread far without support from other servers.

Facebook lookalike Diaspora has a similar design, distributing across many independently operated servers called “pods”. However, content distribution is primarily decided by the user, not the pod administrator. While the pod administrator chooses what other pods to link to, the user independently chooses which users in those pods their posts will be sent to, with a feature called “aspects”. This ideally lets a user segment their friend groups from family or work colleagues, all within the same account, although there is nothing preventing users from registering separate accounts to achieve the same goal.

Both of these models distribute censorship power to server administrators, similar to forum or subreddit moderators. This is a step in the right direction from corporate control, but still creates power inequality between the relatively few server operators and the multitude of users. In the Mastadon example, the Mastadon Monitoring Project estimates that there are about 2400 servers, and 1.5 million registered users. That is, about 0.16% of the population have censorship control. While there’s nothing technical stopping a user from starting their own server and joining the 0.16%, it does require a higher expertise level, a server to run the software on, and a higher time commitment. This necessarily precludes most users from participating in censorship (and if we had 1.5 million Mastadon servers then administering censorship would be unwieldy).

Other Researcher’s Thoughts

The Digital Currency Initiative and the Center for Civic Media (both MIT groups) released a relevant report recently on decentralized web technologies, their benefits regarding censorship, and adoption problems the technologies face. While the report does not address the desirability of censoring hate speech, it does bring up the interesting point that content selection algorithms (like the code that decides what to show on your Twitter or Facebook news feeds) are as important to censorship as actual control of what posts are blocked. This presents something further to think about - is there a way to place more of the selection algorithm under user control without loading them down with technical complexity? This would allow for automatic but democratic censorship, that may alleviate the disproportionate power structures described above.

Posted 8/19/17

Braess’s Paradox

I had the great fortune of seeing a talk by Brian Hayes on Braess’s Paradox, an interesting network congestion phenomenon. In this post I’ll talk about the problem, and some ramifications for other fields.

The Problem

Consider a network of four roads. Two roads are extremely wide, and are effectively uncongested, regardless of how many cars are present. They still have speed limits, so we’ll say there’s a constant traversal time of “one hour” for these roads. The other two roads, while more direct and thereby faster, have only a few lanes, and are extremely prone to congestion. As an estimate, we’ll say the speed it takes to traverse these roads scales linearly with “N”, the number of cars on the road, such that if all the cars are one one road it will take one hour to travel on.

Plain Network

If a driver wants to get from point A to point B, what route is fastest? Clearly, by symmetry, the two paths are the same length. Therefore, the driver should take whatever path is less-congested, or select randomly if congestion is equal. Since half the cars will be on each path, the total commute time is about 1.5 hours for all drivers.

However, consider the following change:

Magic Shortcut Network

In this network we’ve added a new path that’s extremely fast (no speed limits, because they believe in freedom), to the point that we’ll consider it instantaneous.

What is the optimal path for a driver now? A lone driver will obviously take the first direct road, then the shortcut, then the second direct road. However, if all “N” drivers take this route the small roads will be overloaded, increasing their travel time to one hour each. The total commute for each driver will now be two hours.

Consider that you are about to start driving, and the roads are overloaded. If you take the short route your commute will be two hours long. However, if you take the long route your commute will be two hours long, and the other roads will be less overloaded (since without you only N-1 cars are taking the route), so everyone else will have a commute slightly shorter than two hours. This means from a greedy algorithm perspective there is always an incentive to take the more direct route, and help perpetuate the traffic problem.

Simply put, adding a shortcut to the network made performance worse, not better.

The Solution

There are a number of potential solutions to the problem. Law enforcement might demand that drivers select their routes randomly, saving everyone half an hour of commute. Similarly, self-driving cars may enforce random path selection, improving performance without draconian-feeling laws. These are both “greater good” solutions, which assume drivers’ willingness to sacrifice their own best interests for the interests of the group. Either of these solutions provide an incentive for drivers to cheat - after all, the shortcut is faster so long as there are only a few people using it.

Another option is limiting information access. The entire problem hinges on the assumption that users know the to-the-moment traffic information for each possible route, and plan their travel accordingly. Restricting user information to only warn about extreme congestion or traffic accidents effectively prohibits gaming the system, and forces random path selection.


Braess’s Paradox is an interesting problem where providing more limited information improves performance for all users. Are there parallels in other software problems? Any system where all nodes are controlled by the same entity can be configured for the “greater good” solution, but what about distributed models like torrenting, where nodes are controlled by many people?

In a torrenting system, users have an incentive to “cheat” by downloading chunks of files without doing their share and uploading in return. Consider changing the system so users do not know who has the chunks they need, and must made trades with various other nodes to acquire chunks, discovering after the fact whether it was what they were looking for. Users now must participate in order to acquire the data they want. This may slow the acquisition of data, since you can no longer request specific chunks, but it may also improve the total performance of the system, since there will be far more seeders uploading data fragments.

The performance detriment could even be alleviated by allowing the user to request X different chunks in their trade, and the other end must return the appropriate chunks if they have them. This limits wasteful exchanges, while still ensuring there are no leechers.

Fun thought experiment that I expect has many more applications.

Posted 8/8/17

Merkle’s Puzzle-box Key Exchange

Cryptography is fantastic, but much of it suffers from being unintuitive and math-heavy. This can make it challenging to teach to those without a math or computer science background, but makes it particularly difficult to develop a sense of why something is secure.

There are a handful of cryptographic systems however, that are delightfully easy to illustrate and provide a great introduction to security concepts. One such system is Merkle’s Puzzles.

The Problem

Similar to the Diffie-Hellman Key Exchange, the goal of Merkle’s Puzzle Boxes is for two parties (we’ll call them Alice and Bob) to agree on a password to encrypt their messages with. The problem is that Alice and Bob can only communicate in a public channel where anyone could be listening in on them. How can they exchange a password securely?

The Process

Alice creates several puzzle boxes (since she has a computer, we’ll say she makes several thousand of them). Each puzzle box has three pieces:

  1. A random number identifying the box
  2. A long, random, secure password
  3. A hash of parts 1 and 2

Each “box” is then encrypted with a weak password that can be brute-forced without taking too long. Let’s say it’s a five character password.

Alice then sends all her encrypted puzzle boxes to Bob:

Alice sending puzzle boxes

Bob selects one box at random, and brute-forces the five character password. He knows he has the right password because the hash inside the box will match the hash of parts 1 and 2 in the box. He then announces back to Alice the number of the box he broke open.

Bob sending back number of chosen box

Alice (who has all the unlocked puzzle boxes already) looks up which box Bob has chosen, and they both begin encrypting their messages with the associated password.

Why is it Secure?

If we have an eavesdropper, Eve, listening in to the exchange, then she can capture all the puzzle boxes. She can also hear the number of the puzzle box Bob has broken in to when he announces it back to Alice. Eve doesn’t know, however, which box has that number. The only way to find out is to break in to every puzzle box until she finds the right one.

Eve breaking every puzzle box

This means while it is an O(1) operation for Bob to choose a password (he only has to break one box), it is an O(n) operation for Eve to find the right box by smashing all of them.

This also means if we double the number of puzzle-boxes then the exchange has doubled in security, because Eve must break (on average) twice as many boxes to find what she’s looking for.

Why don’t we use Puzzle-boxes online?

Merkle’s puzzles are a great way of explaining a key exchange, but computationally they have a number of drawbacks. First, making the system more secure puts a heavy workload on Alice. But more importantly, it assumes the attackers and defenders have roughly the same computational power.

An O(n) attack complexity means Eve only needs n times more CPU time than Bob to crack the password - so if the key exchange is configured to take three seconds, and there are a million puzzle boxes, then it would take 35 days for that same computer to crack. But if the attacker has a computer 100 times faster than Bob (say they have a big GPU cracking cluster) then it will only take them 0.35 days to break the password. A more capable attacker like a nation state could crack such a system almost instantly.

If Eve is recording the encrypted conversation then she can decrypt everything after the fact once she breaks the puzzle box challenge. This means even the original 35-day attack is viable, let alone the GPU-cluster attack. As a result, we use much more secure algorithms like Diffie-Hellman instead.

Posted 7/17/17

Port Knocking

Port knocking is a somewhat obscure technique for hiding network services. Under normal circumstances an attacker can use a port scanner to uncover what daemons are listening on a server:

$ nmap -sV
Starting Nmap 6.46 ( ) at 2017-07-17
Nmap scan report for (XX.XXX.XX.XX)
Host is up (0.075s latency).
Not shown: 990 filtered ports
22/tcp   open   ssh        (protocol 2.0)
80/tcp   open   http       Apache httpd
443/tcp  open   ssl/http   Apache httpd
465/tcp  closed smtps
587/tcp  open   smtp       Symantec Enterprise Security manager smtpd
993/tcp  open   ssl/imap   Dovecot imapd

Note: Port scanning is illegal in some countries - consult local law before scanning others.

Sometimes however, a sysadmin may not want their services so openly displayed. You can’t brute-force ssh logins if you don’t know sshd is running.

The Technique

With port knocking, a daemon on the server secretly listens for network packets. A prospective client must make connections to a series of ports, in order, without interruption and in quick succession. Note that these ports do not need to be open on the server - attempting to connect to a closed port is enough. Once this sequence is entered, the server will allow access to the hidden service for the IP address in question.

This sounds mischievously similar to steganography - we’re hiding an authentication protocol inside failed TCP connections! With that thought driving me, it sounded like writing a port-knocking daemon would be fun.

Potential Designs

There are several approaches to writing a port-knocker. One is to run as a daemon listening on several ports. This is arguably the simplest approach, and doesn’t require root credentials, but is particularly weak because a port scanner will identify the magic ports as open, leaving the attacker to discover the knocking combination.

Another approach (used by Moxie Marlinspike’s knockknock) is to listen to kernel logs for rejected incoming TCP connections. This approach has the advantage of not requiring network access at all, but requires that the kernel output such information to a log file, making it less portable.

The third (and most common) approach to port knocking is to use packet-sniffing to watch for incoming connections. This has the added advantage of working on any operating system libpcap (or a similar packet sniffing library) has been ported to. Unfortunately it also requires inspecting each packet passing the computer, and usually requires root access.

Since I have some familiarity with packet manipulation in Python already, I opted for the last approach.

The Implementation

With Scapy, the core of the problem is trivial:

def process_packet(packet):
        src = packet[1].src     # IP Header
        port = packet[2].dport  # TCP Header
        if( port in sequence ):
                knock_finished = addKnock(sequence, src, port, clients)
                if( knock_finished ):
                        trigger(username, command, src)
        # Sequence broken
        elif( src in clients ):
                del clients[src]

sniff(filter="tcp", prn=process_packet)

The rest is some semantics about when to remove clients from the list, and dropping from root permissions before running whatever command was triggered by the port knock. Code available here.

Posted 7/17/17

Decentralized Networks

A solution to the acyclic graph problem has been found! This post adds to the continuing thread on modeling social organizations with Neural Networks (post 1) (post 2) (post 3)

The Dependency Problem

The issue with cyclic neural networks is dependencies. If we say Agent A factors in information from Agent B when deciding what message to transmit, but Agent B factors in messages from Agent A to make its decision, then we end up with an infinite loop of dependencies. One solution is to kickstart the system with a “dummy value” for Agent B (something like “On iteration 1, Agent B always transmits 0”), but this is clunky, difficult to perform in a library like Tensorflow, and still doesn’t mesh well with arbitrary evaluation order (for each iteration, do you evaluate A or B first?).

Instead, we can bypass the problem with a one-directional loop. The trick is as follows:

  1. Agent A0 sends a message (not dependent on B)
  2. Agent B0 receives A0’s message, and decides on a message to send
  3. Agent A1 receives B0’s message, and factors it (along with the messages A0 received) in to deciding on a message to send
  4. Agent B1 receives A1’s message, and factors it (along with the messages B0 received) in to deciding on a message to send

We have now created a dependency tree where A can rely on B’s message, and B can rely on the message generated in response, but all without creating an infinite loop. When judging the success of such a network, we look only at the outputs of A1 and B1, not their intermediate steps (A0 and B0).

If it’s absolutely essential you can create a third layer, where A2 is dependent on the message sent by B1, and so on. As you approach an infinite number of layers you get closer and closer to the original circular dependency solution, but using more than two or three layers usually slows down computation considerably without yielding significantly different results.

Social-Orgs, Revisited

With the above solution in mind, let’s re-evaluate the previous social-group problems with two layers of Agents instead of one. Each layer can send all of the data it’s received, with no communications cost or noise, to its counterpart one layer up. This effectively creates the A0 and A1 dynamic described above. When evaluating the success of the network we will look at the accuracy of environmental estimates from only the outermost layer, but will count communications costs from all layers.

Multilayer simple trial

Tada! A social organization that doesn’t revolve around A0 reading every piece of the environment itself!

Note: In the above graph, most nodes only have a 0 or 1 layer, not both. This is because the other layer of the agent does not listen to anything, and is not shown in the graph. More complex examples will include both layers more frequently.

The result is still unlikely - all information passes through A2 before reaching the agents (even A0 gets information about three environment nodes through A2) - but it’s already more balanced than previous graphs.

Next Steps

A better evaluation algorithm is needed. With the two-layer solution there is no longer a requirement for centralization - but there is no incentive for decentralization, either. A real human organization has not only total costs, but individual costs as well. Making one person do 100 units of work is not equivalent to making 10 people do 10 units of work. Therefore, we need a cost algorithm where communications become exponentially more expensive as they are added to a worker. This should make it “cheaper” to distribute labor across several workers.


This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 7/9/17

Ruggedized Networks

This post adds to my posts on modeling social organizations with Neural Networks (post 1) (post 2)

The Problem

The original model defined the objective as minimizing communications costs while getting an estimate of the environment state, and sharing that estimate with all nodes. This objective has a flaw, in that it is always cheaper to let a single agent read an environment node, making that agent a single point of failure. This flaw is exacerbated by the Directed Acyclic Graph limitation, which means since A0 must always read from the environment, it is always cheapest to have the entire network rely on A0 for information.

An Attack

I recently worked on developing a welfare function emphasizing robustness, or in this case, the ability of the network to complete its objective when a random agent is suddenly removed. The result should be a network without any single points of failure, although I am not accounting for multi-agent failures.

The result is as follows:

Diagram of robustness trial

In this diagram, all agents receive information from A0. However, most of them also receive information from A2, which receives information from A1, which is monitoring the entire environment. As a result, when A0 is disabled, only nodes A3 and A5 are negatively effected.

How it Works

To force robustness I created eleven parallel versions of the graph. They have identical listen weights (the amount any agent tries to listen to any other agent), and begin with identical state weights (how information from a particular agent is factored in to the estimate of the environment), and identical output weights (how different inputs are factored in to the message that is sent).

The difference is that in each of these parallel graphs (except the first one) a single Agent is disabled, by setting all of its output weights to zero. The welfare of the solution is the average of the welfare for each graph.

But why are the state weights and output weights allowed to diverge? Aren’t we measuring ten completely different graphs then?

Not quite. The topology of the network is defined by its listen weights, so we will end up with the same graph layout in each configuration. To understand why the other weights are allowed to diverge, consider an analogy to a corporate scenario:

You are expected to get reports from several employees (Orwell, Alice, and Jim) and combine them to make your own final report. When you hear Jim has been fired, you no longer wait for his report to make your own. Taking input (which isn’t coming) from Jim in to consideration would be foolish.

Similarly, each graph adjusts its state weights and output weights to no longer depend on information from the deleted agent, representing how a real organization would immediately respond to the event.

Then why can’t the listen weights change, too?

This model represents instantaneous reaction to an agent being removed. While over time the example corporation would either replace Jim or restructure around his absence, you cannot instantly redesign the corporate hierarchy and change who is reporting to who. Meetings take time to schedule, emails need to be sent, and so on.

Next Steps

This objective is still limited by the acyclic graph structure, but provides a baseline for valuing resiliency mathematically. Once the acyclic problem is tackled this solution will be revisited.


This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 7/6/17

Cryptocurrency Tutorial

I was recently asked to give a talk on bitcoin and other related cryptocurrencies. My audience was to be a group of scientists and mathematicians, so people with significant STEM backgrounds, but not expertise in computer science. In preparation for giving my talk, I wrote this breakdown on the ins and outs of cryptocurrencies.

UPDATE 7/11/17

I gave the talk, it went great! Slides here [PDF].


What is Bitcoin?

Bitcoin is a decentralized currency. There is no governing body controlling minting or circulation, making it appealing to those who do not trust governments or financial institutions like Wall Street.

Where as most currencies have a physical paper representation, bitcoin is exchanged by adding on to a “blockchain”, or a global ledger of who owns what pieces of currency, and what transactions were made when.

Where does the value of Bitcoin come from?

Bitcoin is a fiat currency - Its value comes exclusively from what people are willing to exchange it for. This seems ephemeral, but is not uncommon, and is the same principle behind the value of the US dollar, at least since the United States left the gold standard.

Is Bitcoin anonymous?

Yes and no. All bitcoin transactions are public, and anyone can view the exact amount of money in a bitcoin wallet at any given time. However, bitcoin wallets are not tied to human identities, so as long as you keep the two distinct (which can be challenging), it is effectively “anonymous”.

How is Bitcoin handled legally?

Some countries consider bitcoin to be a currency (with a wildly fluctuating exchange rate), while others regard it as a commodity with an unstable value. Most countries will tax bitcoins in some way or another, but due to the aforementioned anonymity it is easy to avoid paying taxes on bitcoins.

What is the blockchain?

The blockchain is a technology solving two problems:

  1. How do we know who has what currency?
  2. How do we prevent someone from spending currency that isn’t theirs?

The second problem includes preventing someone from “double-spending” a bitcoin they legitimately own.

A blockchain is a sequence of “blocks”, where each block holds “facts”. These facts describe every transaction of bitcoins from one person to another. To make a transaction, you must create a block describing the transaction, and convince the majority of the nodes in the bitcoin blockchain to accept your transaction.

What does a block consist of?

A block has four fields:

  1. A string describing all contained facts
  2. The identifier of the previous block in the blockchain (maintains an explicit order for all transactions)
  3. A random string
  4. The SHA256 hash of all of the above

A block is accepted in to the blockchain if and only if the SHA256 hash starts with at least n leading zeroes. This makes generating a block equivalent to hash cracking (keep changing the random string until you get the hash you want), and the larger n is, the more challenging the problem is to solve.

For example, if n=5:

A losing block hash (will be rejected):
A winning block hash (will be accepted):

The number of leading zeroes n is increased periodically by group consensus so that even as more people begin to work on generating blocks, the rate of new blocks remains approximately constant (~one every ten minutes). This makes it extremely unlikely that two new and valid blocks will be generated near the same time, and therefore creates a continual chain of events making double-spending impossible.

Looking for a new valid block is colloquially referred to as “bitcoin mining”.

Note: The hashing algorithm (sha256) is specific to bitcoin. Other cryptocurrencies may use different hashing algorithms to discourage the use of GPUs in mining.

Can I spend someone else’s coins by mining a block?

Bitcoins are tied to a “bitcoin wallet”, which is a public/private keypair. To send coins to a new wallet you must make a blockchain fact describing a transfer of X bitcoins from one wallet’s public key to another, signed with the private key of the originating wallet. Therefore unless you have access to the private key, you’ll be unable to control the bitcoins associated with it.

Why would anyone mine blocks?

Each successfully mined block yields the miner some currency. They include their own wallet address as one of the facts in the block, and receive a fixed amount of currency (25BTC for bitcoin) at that address. This is also why you must pay a small transaction fee to send anyone a bitcoin - you are asking someone to include your transaction in their massive mining effort.

Doesn’t this mean there are a fixed number of bitcoins in the world?

Some readers may have noticed that SHA256 has a fixed length (256-bits, or 32 characters). If we periodically increase n, then eventually we will require that all 32 characters of the hash be “0”, which will make adding to the end of the blockchain impossible. Since you receive 25 bitcoins for each mined block, this puts the maximum number of bitcoins at about 21 million.

This upper limit poses a number of problems. There are a finite number of transaction blocks, after which all bitcoins will be unmovable, and therefore worthless. There are a finite number of bitcoins, so if you send some to a non-existent address, or forget your private key, those coins are effectively destroyed forever. This, along with commodity speculation, is responsible for the incredible fluctuation in the value of bitcoin.

Trust Issues

One problem with a decentralized currency like bitcoin is that there is no revocation of money transfers. With a bank, you can make a purchase with a credit card, and later dispute that purchase, claiming you did not receive what you paid for, and the bank can reverse the charge. You can also use banks and lawyers to create contracts, agreeing to pay a certain amount before a service is rendered and a certain amount after, with other complications like security deposits.

None of this infrastructure exists with bitcoin, making it an extremely scam-prone transaction system. Some people use escrow services, but these are all very ad-hoc. This is also one of the reasons bitcoin is commonly used in ransomware attacks, or for purchases of drugs or stolen property on the “deep web”.

What about alt-coins?

There are several variations on bitcoin, called “alternative-coins” or “alt-coins”. Some of the most interesting are:


Namecoin treats the blockchain as an extremely distributed database of information tied to specific identities. It’s effectively the same as bitcoin, except in addition to storing “coins” with particular wallets, you can store domain names, email addresses, public encryption keys, and more.

In theory, this removes the need for centralized DNS servers, or domain-registrars for the Internet. Everyone can perform DNS lookups by looking for the domain name in question in the blockchain, and can transfer domains to each-other in exchange for namecoins.


Ethereum tries to solve the trust issues of bitcoin by allowing you to write programmatically-enforceable contracts and embedding them in to the blockchain.

Consider the following blockchain:

ABC Blockchain

Block A contains a program with psuedocode like the following:

if( security_deposit_received and date == December 5th and house_not_destroyed )
    send(security_deposit, from=Bob, to=Alice)
else if( date > December 5th )

When block A is added to the chain the code inside is evaluated by every node in the chain. The code is re-evaluated as each subsequent block is added, until after December 5th when the code can be safely ignored.

Block B contains a transfer of $1000 from Alice to Bob, as a security deposit.

On December 5th, if the house is not destroyed, the security deposit is automatically returned to Alice by Bob.

Ethereum therefore allows you to create contracts which are enforceable without lawyers or banks, and cannot be violated by either party once issued.

Other uses for Ethereum contracts include provably-fair gambling, and generic distributed computation, where you pay each participating node for running your application.

Ethereum suffers from a few issues:

  • The complexity makes it less approachable than Bitcoin
  • Without widespread cryptographically verifiable Internet-of-Things devices the types of contracts you can express are limited
  • All code is publicly viewable, but not changeable, so if someone finds a security hole in your code, it cannot be easily patched

Despite these limitations, Ethereum has much more functionality than other cryptocurrencies and is gaining in popularity.


The best cryptocurrency. It uses a logarithmic reward function, so the first few blocks yield many dogecoins, while later blocks yield fewer. This guarantees that lots of coins enter circulation very quickly, making it a viable currency immediately after launch. It also uses scrypt instead of sha256, and so doesn’t suffer from the same GPU and ASIC-mining problems plaguing bitcoin.

Dogecoin was started as a meme in 2013, but is collectively valued at over $340 million as of June 2017, which its user-base finds hilarious. However, because of the massive number of coins in circulation, a single dogecoin is only worth about $0.00095.

The Dogecoin community is particularly noteworthy for donating more than $30,000 to ensure the Jamaican bobsledding team could travel to the 2014 Winter Olympics.

Posted 7/5/17

Social networks should not be DAGs

This post continues the discussion of modeling social organizations using Neural Networks.

The Setup

Over the past week, I have been working with a neural network that represents a DAG of the following scenario:

  • There are ten agents (people, computers, etc) in an organization

  • There are five “environment” nodes, representing information in the world

  • Each agent can either listen to an environment node, or listen to messages from another node

  • There is a cost to all communication, as described in my previous post

  • Each agent is calculating the average of the five environment values

The neural network is attempting to minimize communications costs, while maximizing the accuracy of each agent’s information. The equation for loss looks like the following:

for each agent:
    difference += mean(environment) - mean(agent's estimate of each environment node)
    cost += listening_cost_for_agent + speaking_cost_for_agent
total_loss = (difference / num_agents) + cost

To start with, all agents will listen to each agent under them (A9 can listen to everyone, A8 to everyone except A9, …), and will listen to all environment nodes. The neural network will then remove edges until it reaches maximum efficiency. We’ll begin assuming messages between agents are just as expensive as listening to the environment.


In an ideal world, the solution to the problem is as follows:

Absolute Minima

This has the lowest total cost because each agent only has to listen to one other node, except for A0, who listens to all the environment nodes and distributes the information. However, the neural network regularly produces solutions like the following:

Local Minima

In this model, A0, A1, and A2 all read from the environment nodes, and everyone else listens to A2. Why?

This is a problem known as a local minima. As the neural network slowly took optimization steps (“learning”), it realized it could reduce cost if A1 stopped listening to A0. Unfortunately, this means A1 must listen to all the environment nodes to get an accurate estimate. If it wanted to listen to A0 instead it would first have to add an edge to listen to A0, which would increase cost, and is therefore unlikely to be chosen. This gets the entire network “stuck”.

Problems with the Model

The main issue leading to the above local-minima is that we are optimizing for an unnatural situation. With a directed acyclic graph, A0 must always listen to all the environment nodes, since it is forbidden from listening to any other agents. Therefore, we should not be optimizing for lowest communications cost, but for the fastest path from each agent to A0.

The model is also inaccurate in that it minimizes only total cost, not per-agent cost. If we grow the environment, to say a million nodes, it becomes ridiculous to suggest a single agent should be listening to the whole environment. A more accurate representation of the world would minimize the cost for each agent, distributing work, and possibly coming up with something closer to the following:

Communist Neural Network

This produces slightly higher total message costs, but each agent on the inner ring only needs to listen to two and send two messages, and those on the outer ring only need to listen to a single message.

Next Steps

Moving forward, we need a new model for network communication costs - preferably one not based on a DAG. From there we can develop other models, representing fragmented information (not every agent needs to know everything about the environment), redundancy problems, and so on.


This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 6/23/17

Optimizing Network Structure

We want to develop an “optimal” network. Here we mean “network” in the scientific sense: Any graph structure from a corporate hierarchy, to a military chain of command, to a computer network apply. What is “optimal” is user-defined - maybe we want to move information as quickly as possible, maybe we want something that works correctly event when many nodes go offline.

Given such a broad problem, the only reasonable solution looks like machine learning. Let’s dive in.

What does the network look like?

Let’s define a network as a DAG, or Directed Acyclic Graph. This is a simplification, as it assumes all communications are one-directional, but this lack of cycles will make the network much easier to reason about.

So the network is a series of nodes, with each lower-level node able to send messages to higher-level nodes (represented as edges).

DAG example

What does communication look like?

Every message has two costs - one to send and one to receive. Let me justify that in two different scenarios, and we’ll say it generalizes from there.

Human Society:

If I want to communicate an idea I need to express myself clearly. Putting more time and thought in to how I phrase my idea, and providing background knowledge, can make it more likely that I am understood.

If you want to understand the idea I am telling you then you need to pay attention. You can pay close attention and try to follow every word I say, or you can listen/read half-heartedly and hope I summarize at the end.

Computer Network:

I am sending a message over a network with some noise. It could be a wireless network on a saturated frequency, a faulty switch, or just some very long wires with EM interference. I can counteract the noise by adding parity bits or some more complex error correction code. Doing so makes my message longer and requires more CPU time to encode.

Similarly, a receiver needs to decode my message, verify all the checksums, and correct errors. Therefore, transmitting a “better” message costs both bandwidth and CPU time on both ends of the connection.

How do we optimize?

This is already smelling like a machine learning “minimization” or “maximization” problem. All we need now is a continuous scoring function. Something like:

total_score = success_level - cost

From here, we could use a variety of tools like a Neural Network or Genetic Algorithm to maximize the total score by succeeding as much as possible with as little cost as possible.

The “success level” will be defined based on the constraints of the problem, and may be any number of arbitrary measurements:

  • How many nodes receive information X propagating from a seed node?
    • How many nodes can get three different pieces of information from different sources and average them?
  • How many nodes can be “turned off” while still getting information to the remaining nodes?
  • And many others…

Machine Learning Model

A DAG already looks pretty similar to a neural network, so let’s start by using that for our ML model. The model doesn’t quite fit, since neural networks have an “input” and “output” and are typically used for decision making and function approximation. However, the process of adjusting “weights” on each edge to improve the system over time sounds exactly like what we’re trying to do. So this won’t be a typical neural network, but it can use all the same code.

Dynamic Topology

All of the above sounds great for figuring out how much effort to put in to both sides of a communication channel. It even works for optimizing the cost of several edges in a network. But how can it design a new network? As it turns out, pretty easily.

Start with a transitive closure DAG, or a DAG with the maximum possible number of edges:

Transitive Closure DAG

For every edge, if either one side puts no effort in to sending their message, or the other side puts no effort in to receiving a message, then any message that gets sent will be completely lost in noise, and never understood. Therefore, if the receiver cost isn’t high enough on an edge, we can say the edge effectively does not exist.

Coming Soon

Future posts will talk about implementing the system described in this post, applications, and pitfalls.


This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 6/14/17

Network Coding

I was recently introduced to the idea of Network Coding, a relatively obscure and rarely implemented technique for simplifying routing and increasing network throughput and security.

This post is an introduction to the theory, but there may be future posts on its application.

The Problem

We’ll start with the quintessential Network Coding problem - The Butterfly Network.

Assume we have data streams A and B, both of which must pass through the network to reach clients 1 and 2. The network topology looks something like the following:

Empty routing diagram

We’ll say that this is a simple network, where each node can only transmit one bit at a time. Therefore, we have a bottleneck: The first switch can either transmit data stream A, or data stream B, but not both at once. This means with traditional routing we must choose to prioritize either client 1 or 2, delaying the other.

Network Coding, however, provides an alternate solution. The first switch combines the two data streams, and transmits A+B to both clients. The result looks as follows:

Network Coding diagram

Now client 1 can reconstruct data stream B by subtracting A from A+B, and client 2 can similarly reconstruct A by subtracting B from A+B. As a result we have satisfied both clients using only one transmission, with no delays.

Layering Signals

The key to network coding, of course, is how to combine and separate data streams. In a computer science context, with bits and bytes, we can define the layering as xor, such that A+B means A^B. This is extremely convenient, as with the properties of xor we can reconstruct the original streams with B=A^(A^B) and A=B^(A^B).

In a more generic “signals” context, you can think of A+B as layering two waves on top of each other, then extracting either wave by subtracting the other.

Security Benefits

Network coding has a number of advantages besides solving the butterfly routing problem, but one I mentioned already was security. Layering data with network coding provides defense against some types of eavesdropping attacks, because given either A or A+B it is impossible to extract B. This makes it potentially advantageous to fragment your data and send it over multiple channels, making full message recovery difficult for an attacker.

Posted 6/6/17


Recently I was working on a MUD, or Multi-user-dungeon with a friend. Like many multi-player games, MUDs are vulnerable to scripting and cheating. To prevent cheating many MUDs rate-limit commands from users, or have a concurrent turn-based system, where events occur at set intervals regardless of when commands were entered.

But what about preventing users from scripting account registration? On the web we often use CAPTCHAs to prevent automation, so what if we could do…

Captchas on the command-line

We want to reproduce this:

Captcha example from Wikipedia

In a terminal like this:

| |    |                                      o      
| |  __|          __,  _  _         __,   __      _  
|/  /  |  |   |  /  | / |/ |  /\/  /  |  /  \_|  |/  
|__/\_/|_/ \_/|_/\_/|/  |  |_/ /\_/\_/|_/\__/ |_/|__/
Answer: ldugnxaoie

Generating a captcha as ASCII art is pretty easy using figlet. The whole thing comes out to:

#!/usr/bin/env ruby

Fonts = ["small", "mini", "script", "standard", "slant", "banner"]
Letters = ('a'..'z').to_a.shuffle[0,rand(8..12)].join
Text = `figlet -f #{Fonts.sample(1)[0]} #{Letters}`

puts "Captcha:"
puts "#{Text}"
print "Answer: "
response = gets
unless( response.nil? )
if( response == Letters )
        puts "Correct!"
        exit 0
        puts "Incorrect."
        exit 1

And there’s my terrible idea for the day.

Posted 5/25/17

SnailDoor, the Socketless Backdoor

Imagine a system where users can ssh in, but once logged in cannot create any sockets (or at least, all connections are blocked by a firewall). So you can run code on the system, but can’t create network services for anyone else without giving them your password.

However, there is an instance of Apache running, with home directory hosting enabled. It only supports static files, but surely we can tunnel through this somehow? Enter SnailDoor, which implements a network shell as a crude proof of concept for the technique.


SnailDoor creates 256 files in the web hosting folder, one for each potential byte. It then records the file access time, and polls all the files a few times a second to see if the access time has changed.

for bf in byteFiles:
    newtime = os.path.getatime(bf.path)
    if( newtime != bf.accesstime ):
        shellBuffer += [bf.byte]

If the access time has changed for a file, SnailDoor adds the corresponding byte to its buffer. It continues polling in a loop until it reads a newline, at which point it executes the buffer as a shell command, and saves the results to output.txt, also in the web hosting folder.

The client can now write a character at a time to the server by making a GET request to http://somewebsite/byte.txt, as follows:

for char in list(cmd):
    filename = str(ord(char)) + ".txt"
    urllib2.urlopen(url + "/" + filename).read()


With a trivial implementation, SnailDoor is limited to one-byte-per second in the to-server direction, or 8-baud. This is because file access timestamps are stored in epoch time, which has an accuracy of one second. If multiple files are accessed in a second then each will have the same access time, making the byte order impossible to determine.

However, there are optimizations to stretch this limit. If we create a second set of 256 files we can represent even and odd bytes, increasing bandwidth to 2 bytes per second. Obviously this is an O(n) solution, and with 38400 files we can reach 1200 baud, which is fast enough for a decent interactive shell.

The largest limitation of SnailDoor is that it relies on the accuracy of access timestamps. These are an inherently inefficient part of the filesystem, since they require a write to disk every time a file is read. As a result, many sysadmins disable access time recording, so access time is only updated when a file is written to.

Posted 3/7/17

The Diffie-Hellman Key Exchange Made Simple

To send encrypted messages to one another we need one of two things:

  1. Public/Private Keypairs

  2. A Shared Secret

The Diffie-Hellman Key Exchange is a technique for generating a shared secret securely, even over an insecure channel where someone may be listening in. I have found many explanations of Diffie-Hellman to be needlessly complex, and so this post will attempt to describe the algorithm simply and succinctly.

The Necessary Math

Diffie-Hellman is based on two principles. Understanding both is essential to understanding why Diffie-Hellman works, and why it is secure.

  • Given a, n, and ax mod n, it is hard to determine what x is

  • axy == ayx

The first is true because assuming x is reasonably large it will take a long time to discover by brute force, and the modulus prevents any kind of sneaky shortcut like trying big and small numbers to close in on the correct value.

The second is true because axy == ax*y by the power rule, and ax*y == ay*x because multiplication is commutative.

The Exchange

Alice and Bob want to send secret messages back and forth, and must first generate a shared secret. The process is as follows:

  1. Alice and Bob agree on a base g, and a modulus n, where n > g

  2. Alice chooses a secret number x, and A = gx mod n

  3. Bob chooses a secret number y, and B = gy mod n

  4. Alice sends Bob A

  5. Bob sends Alice B

  6. Alice calculates s = Bx mod n

  7. Bob calculates s = Ay mod n

Both sides now have a shared secret s, which is equal to gxy mod n, and is equal to gyx mod n.

From here, s can be used as the key for a number of symmetric ciphers like AES.

Why is it secure?

No one ever sends x or y across the network. Without at least one of these values, an eavesdropper can never learn what gxy mod n is.

This means two people can generate a shared secret even in the presence of a passive attacker, and still remain secure.

When is it not secure?

Diffie-Hellman key exchanges are vulnerable to active attackers. This means if someone blocks the connection between Alice and Bob, they can introduce their own secret and man-in-the-middle all messages between Alice and Bob.

Charlie generates their own secret number z, and C = gz mod n. They send C to Alice and Bob.

Alice and Charlie then generate s1 = gxz mod n, while Bob and Charlie generate s2 = gyz mod n.

Charlie can now intercept all the messages between Alice and Bob, decrypt them with s1, and re-encrypt them with s2.

Posted 1/10/17


Puddle is a prototype anonymity network designed and implemented by a friend (Taylor Dahlin) and myself. It lacks any cryptography, but explores how to perform anonymous routing and decentralized information lookups. Below is an adaptation of our research paper describing the system.


Information online is shared in two distinct styles. The first is in terms of specific data: A single email, or a URL to an exact webpage. This form is convenient for storing information, but makes finding information tedious. The second style of information sharing, a search engine, makes use of non-specific information. A user requests data on a topic, and the search engine returns all relevant specific data.

Information in anonymity networks, or “darknets”, is almost exclusively shared using the specific file style. Search engines seem to be the antithesis of anonymity, as a traditional search engine maintains an index of all information on a network and where it is located, and is also in a perfect position to track what data users are interested in. Unfortunately this makes discovering information on networks like Tor and I2P frustrating and unapproachable for even technical users.

Puddle is an attempt at a distributed and anonymous search engine system. Similar to an engine like Google a client can send out a request for information on a particular subject, and the network will return relevant files. However, unlike public search engines Puddle has no central index of information, or any bottlenecks where requests can easily be traced to their origin.


Puddle is implemented as an HTTP API. This provides a simple framework for requests and responses. GET requests represent requests for information, while PUT requests represent uploads of data related to the GET subject.

HTTP requests are rippled through connected nodes using a “time-to-live” to ensure that requests do not bounce indefinitely, and do not require a specific route.

Each information request is sent with two time to live values, formatted as follows:

GET /relay/:ttlOrig/:ttlCurrent/:topic

The “current” TTL acts like normal: It is decremented at each hop through the network, and the message is discarded if the TTL reaches zero.

The “original” TTL is used for responses, telling each relay how high a TTL it needs to set to guarantee its message will reach the source. Both time-to-lives are slightly randomized, so it is impossible to determine where a message originated unless an assailant controls all neighbors of the originating relay.

Responses to information requests are formatted as:

PUT /relay/:ttl/:topic/:filename

The response uses only a single time to live, responds to a specific topic, and specifies the appropriate filename. The content of the PUT request is the file data itself.

In both requests the topic and filename are base64 and then URL encoded, to ensure that malicious filenames or topics cannot be chosen to malform the URL.


We implemented Puddle in Ruby for rapid prototyping in a loose-typed language. Our implementation is broken into five segments:

  • Request Processing - Receives HTTP requests for information or responses with information

  • File Processing - Manages files being shared, downloaded, or cached

  • Signalling - Sends new requests or forwards the requests from other relays

  • Client - Handles interaction with the user via a web interface

  • Peeting - Manages adding new peers, removing defunct ones, and integrating into the larger network

Request Processing

We use Sinatra (an HTTP server library) to register handlers for each type of request. These handlers then pass off work to the file or signalling modules as needed.

File Processing

The file processing module reads the files in a “data” folder and determines what topics they are related to. When the module receives a request it checks whether there is relevant data, and if there is sends those files to the Signalling module. It also manages a data cache of files that have passed through the relay recently. This ensures that information that is frequently requested on the network propagates and becomes faster to fetch in the future.


This module sends HTTP requests to other relays. It acts as a thread pool so that requests can pile up and be sent incrementally. Otherwise this module is mostly a wrapper around different HTTP requests. We use the “patron” library to send HTTP requests for us, simplifying the networking requirements substantially.


The client module is similar to the “request processing” module except that it maintains an internal website for human users to input requests, retrieve results, and view the current state of the node. This part of the website is restricted so it can only be accessed from localhost, providing a modicum of authentication.


Our implementation of Puddle is only a proof of concept, and has many shortcomings. However, it achieves its goal of creating a decentralized and anonymous search system. We hope that other scientists see our design and build off of it to create a more complete network.

Further Work

There are several areas our implementation does not yet touch on. First, we do not use cryptography. Encryption is not strictly necessary for anonymity in Puddle, as privacy is protected by randomized time-to-live values that conceal the origins of messages. However, encryption is necessary to protect against a Sybil attack, as we would need a method similar to Pisces’ cryptographically signed routing tables to detect malicious relays.

Puddle is also vulnerable to denial of service attacks. Since each message ripples from a relay to all of its peers there is a great deal of bandwidth used, and on tightly linked networks there is a high level of message duplication. One potential solution is to use random walks down a small number of neighboring relays, rather than broadcasting messages to everyone. This limits the bandwidth used, but also limits how many relays will be reached, potentially missing hosts with relevant content. Relays would need to re-transmit requests so long as the user is still interested to ensure that the likelihood of reaching a relevant relay is high.

Finally, our implementation uses a trivial definition of “relevant” content, and determines which files to up-load solely based on the filenames. A next step would be implementing some type of tagging system, or applying natural language processing to files to automatically determine what content is relevant to a request. Such a solution would also need to account for file name collisions, and how to handle extremely large files that could clog the network.

Posted 11/14/16

Traceroute is Black Magic

Traceroute is a utility for detecting the hops between your computer and a destination server. It is commonly used for diagnosing network problems, and in conjunction with ping makes up the majority of ICMP traffic.

It is also a piece of black magic that exploits the darkest turns of networking to succeed.

Traceroute Overview

At its core, traceroute is simple:

  • Send a packet to the destination with a time-to-live of 1

  • When the packet inevitably fails to deliver, the furthest computer it reached sends back a “time exceeded” packet

  • Now try again with a time-to-live of 2…

This continues until a packet finally does reach the destination, at which point the traceroute is finished. The time-to-live of the successful packet tells you how many hops away the destination server is, and the hosts the “time exceeded” packets come from gives you the address of each server between you and the host. The result is the familiar output:

% traceroute
traceroute to (, 64 hops max, 52 byte packets
 1 (  1.274 ms  0.944 ms  0.897 ms
 2 (  19.004 ms  8.942 ms  8.451 ms
 3 (  9.279 ms  9.269 ms  8.939 ms
 4 (  9.546 ms  9.101 ms  9.935 ms
 5 (  9.197 ms  9.214 ms  9.443 ms
 6 (  12.564 ms (  11.646 ms (  13.703 ms
 7 (  12.517 ms  12.109 ms  14.443 ms
 8 (  12.600 ms  12.057 ms (  12.915 ms
 9 (  12.188 ms  12.207 ms  12.031 ms
10 (  13.493 ms  13.934 ms  13.812 ms
11 (  12.241 ms  13.006 ms  12.694 ms
12 (  12.841 ms  12.563 ms  12.410 ms

Well that doesn’t look so bad! It gets a little messier if there are multiple paths between you and the host, particularly if the paths are of different lengths. Most traceroute implementations deal with this by sending three different probes per TTL value, thus detecting all potential paths during the scan. You can see this under hops ‘6’ and ‘8’ in the above example.

The real dark side is in how to support multiple traceroutes at once.

How does the Internet work, anyway?

When you send a TCP or UDP packet it includes four key pieces of information:

  • Source IP (so the other side can write back)

  • Source Port (so you can distinguish between multiple network connections, usually randomly chosen)

  • Destination IP (so your router knows where to send the packet)

  • Destination Port (so the server knows what service the packet is for)

TCP also includes a sequence number, allowing packets to be reorganized if they arrive in the wrong order. UDP opts for “drop any packets we don’t get in order.”

However, ICMP is a little different. It includes the source and destination addresses, but has no concept of a “port number” for sending or receiving.

Traceroute can send probes in TCP, UDP, or ICMP format, but it always receives responses as ICMP “TIME EXCEEDED” messages.

Parallel Traceroute

So if ICMP responses don’t include port numbers, how can your computer distinguish between responses meant for different traceroutes?

The trick is in a minor detail of the ICMP specification. For Time Exceeded messages the packet includes a type (11 for time exceeded), code (0 or 1 depending on the reason the time was exceeded), a checksum, and the first 8 bytes of the original packet.

UDP is just small enough that the first 8 bytes of the UDP header include the source port. Thus if we choose our source port for our probe carefully we can use this same number as an ID received in the ICMP response. This requires creating our own raw packets (as root) so we can select a source port, and then parsing the bytes of the ICMP response ourselves to extract the ID.

On FreeBSD the traceroute program is setuid root for this purpose, and it uses its own process-ID to select an unused source port for its probes. To quote the FreeBSD implementation source code:

Don’t use this as a coding example. I was trying to find a routing problem and this code sort-of popped out after 48 hours without sleep. I was amazed it ever compiled, much less ran.

And yet it does run, and has run this way for more than 20 years.

Why do you know this? What’s this arcane knowledge good for?

I implemented traceroute in Python. It was part of a larger project to detect critical hub systems across the Internet, which may be deserving of its own article once I have more conclusive data. The point is I needed to run a lot of traceroutes simultaneously, and doing it myself with multithreading gave me better access to data than trying to parse the output from the traceroute program over and over.

Posted 8/17/16

Let’s talk about touch-tone telephones!

You remember the soft beeping of old phones when you punched in numbers? I wonder how those work! Sounds like a good premise for a project. Let’s write a script that takes a bunch of numbers and outputs the appropriate tones, and no cheating by saving off recordings of the beeps.

Signal Overview

Dual Tone Multiple Frequency signals are a way of transmitting computer-readable data over analog lines. It works by combining two different sine waves to create a single tone:

  1209 Hz 1336 Hz 1477 Hz 1633 Hz
697 Hz 1 2 3 A
770 Hz 4 5 6 B
852 Hz 7 8 9 C
941 Hz * 0 # D

Why does it work this way? Ruggedization. Analog audio equipment can have weird distortions and pitch shifts, and if we used a single sine wave for each digit then those distortions could lead to misinterpreting a signal. Very awkward if you dial a phone number and it connects to the wrong person. By creating a dual-tone signal each digit is extremely distinct from one another and are almost impossible to confuse.

What are those “ABCD” digits? My phone doesn’t have them!

Those are used within the telephone company for some internal controls. They’re interesting, and you can read more about them here, but they’re outside the scope of my project.


If you’re like me and you forgot your trigonometry, don’t worry! Adding two waves is very simple. Sine waves output between -1 and 1. A higher frequency means the wave repeats itself more quickly, while a higher wavelength (and lower frequency) means the waves are more drawn out.

We can add two waves by layering them over each other on a graph, and adding their points together. If one wave is at -1, and the other is at 0.25, then the new wave has a point of -0.75. The effect looks like this:

Adding waves

The Code

At this point all we have left is implementation. I went at this in a clunky way:

  • Figure out how long our tone needs to be

  • Choose the two waves we need to make up a digit

  • Create an array of outputs, where each entry is the noise of the digit at a specific point

  • Convert the array of outputs to a wave file

The code is fairly simple, and all up on GitHub. The core is calculating the value of a wave at a certain point in its period:

Math::sin(position_in_period * TWO_PI) * max_amplitude

The amplitude (height of the wave) determines the volume of the tone.

After that, all that’s left is using the Ruby Wavefile Gem to encode the array of samples as noise and save it to disk, and then a little glue code to parse user input and trigger all the calculations.

One day this might have been an awesome tool for Phone Phreaking, but today most of the telephone network is digital, and there’s only so much you can do with a tone generator. So it goes.

Posted 8/16/16