I wanted to implement concise “pattern matching” in Python, a language which unlike C#, F#, Scala, and so on, does not have any pattern matching built in. Logically a pattern is just a predicate: a function which takes a value and returns true if the value “matches” the pattern, false otherwise. The code for this episode is here.
When I embarked on this I realized two things. First, that it might be nice for debugging purposes to have more information in the return value than just “true” or “false”; in particular, when a complex pattern fails to match, but it should match, then I’ve made a mistake. In order to debug that mistake, I actually made patterns return a new type:
class MatchResult(ABC):
test: Any
submatches: Dict[str, "MatchResult"]
# ... and so on, boilerplate code for the rest of this.
A complex pattern might fail because of its component patterns failing to match, and so that information is included in the failure result. I also made subclasses Success
and Fail
.
You know I just realized this very moment that I’ve used “Fail” as a noun here. It should have been Failure
. Oh well.
Second, I realized that using functions for everything would lack concision. I didn’t want to write code like this throughout:
is_binop = lambda x: Success(x) if isinstance(x, ast.BinOp) else Fail(x)
The notion of “is this thing a binop” is captured entirely by the type BinOp already; could we just use that object where a pattern is required?
After playing around with different options I eventually landed on this type discipline:
class PatternBase(ABC):
@abstractmethod
def match(self, test: Any) -> MatchResult:
pass
def __call__(self, test: Any) -> MatchResult:
return self.match(test)
Pattern = Union[PatternBase, int, str, float, type, list, None]
A pattern base object has a match predicate which takes a test value and, moreover, may be treated as a function you can call.
Anywhere that we need a pattern, the caller may provide:
- an instance of pattern base
- an None, integer, float or string value; the semantics are “the pattern matches if test is equal to the given value”.
- a type; the semantics are “the pattern matches if test is of the given type”
- a list of patterns; the semantics are “the pattern matches if test is a list of the same length, and each list element matches the corresponding pattern”
So far we can express patterns such as
p = [ BinOp, 3, None ]
# p matches a list of three items where the first is a binary op AST,
# the second is the number 3, and the third is the value None
But how would we express “match instances of Foo
whose bar
attribute is 3″? We can start by solving the more general problem of how do we match against multiple patterns? Here’s our first combinator:
def match_every(*patterns: Pattern) -> Pattern:
# ... the implementation is tedious to make it efficient,
# but you get the idea; we produce a pattern which succeeds if
# every given subpattern succeeds, and fails otherwise.
We can solve the specific problem of matching a subpattern that is an attribute of the tested value:
class AttributeSubpattern(PatternBase):
name: str
subpattern: Pattern
def match(self, test: Any) -> MatchResult:
submatch = match(self.subpattern, getattr(test, self.name, None))
submatches = {self.name: submatch}
if submatch.is_success():
return Success(test, submatches)
return Fail(test, submatches)
# ...
It’s a little clunky to call AttributeSubpattern
so I made a synonym function attribute
as well. Notice that this class is a combinator; it takes a pattern and produces a new pattern.
And now we can make a third combinator out of the first two: I want a pattern that checks to see if an object is of a particular type, and whether it has an attribute which matches a pattern:
def type_and_attributes(typ: type, patterns: Dict[str, Pattern]) -> Pattern:
t: List[Pattern] = [typ]
tuples: List[Pattern] = [
attribute(name, subpattern) for (name, subpattern) in patterns.items()
]
return match_every(*(t + tuples))
And now we can say
is_addition = type_and_attributes(BinOp, {"op": Add})
Or, even better, let’s make a pattern that always matches and yet another combinator:
class AnyPattern(PatternBase):
def match(self, test: Any) -> MatchResult:
return Success(test)
_any = AnyPattern()
def binop(
op: Pattern = _any,
left: Pattern = _any,
right: Pattern = _any) -> Pattern:
return type_and_attributes(
BinOp, {"op": op, "left": left, "right": right})
is_addition = binop(Add)
I also defined patterns or combinators for:
- Match at least one of these patterns
- Match the first item in a list to one pattern and the tail of the list to another
- Match at least one item in a list to a given pattern
- Match all items in a list to the same pattern
- Match the opposite of this pattern
- Helper combinators for commonly used AST nodes such as identifiers, operators, and so on.
- And so on; you get the idea.
With these tools in my toolbox I could then rapidly create patterns such as “is any item in this list not an identifier?”
any_non_id = ListAny(negate(name()))
Patterns identify what code needs to be transformed; rules transform them. Next time on FAIC we’ll look at building a set of rules and rule combinators for concisely representing AST transformations.