It's already been weeks since PyCon! Phew, I've been busy recently.
Anyway, I had an eventful trip to PyCon in Santa Clara, California. PyCon
is the biggest Python conference with about 2500 delegates from all around
the world (though most seemed to come from the US).
I had chats with Guido, Armin Rigo (PyPy) and many others. After the
conference, I stayed around for a few days in the San Francisco Bay Area
and gave a talk at Google Mountain View, and also visited Dropbox in San
Francisco.
One of my main goals for the trip was trying to gauge
whether mypy is going in the right
direction in the eyes of the Python community.
There was a lot of interest in the project,
but some important issues were raised that I need to discuss in more
detail.
The ability to do compile-time checking of programs even without a new VM was
interesting to many. This would benefit projects and organizations with large
existing Python code bases.
However, these organizations also manage risks carefully.
Currently mypy can be used on top of CPython, but the sources must always be
translated to Python before execution.
Adding the mypy tool chain to the core build process is something
most seem to be reluctant to do. Obviously this is the case now as mypy is
still experimental, but
I got the impression that even if mypy were considered stable and mature,
relying on a third-party tool to be able to run their code would be
a pretty daring and unlikely move.
Also, mypy has the problem of being not quite compatible with
many Python tools such as IDEs. This is a chicken-and-egg problem: tool support
would probably fix itself if mypy were widely used, but it's difficult to
get wide use without tool support. Library support is similar. However, there
may be a way around this dilemma -- just stay with me for a few more
paragraphs.
Many organizations using Python are still stuck with 2.x, and find
the transition to Python 3 difficult. Even upgrades from 2.x to 2.x+1 have
caused a lot of trouble, and the switch to Python 3 is much trickier,
in large part due to changes in string representations (str/unicode in Python
2.x versus bytes/str in Python 3.x). Mypy currently only supports Python 3.x
syntax, which limits its usefulness to many.
Some also saw the challenge of developing a production-quality mypy VM as
too large for
our team. I think this is in large part down to how previous projects have
fared, including PyPy:
even after many years, and with several talented developers,
its adoption has been pretty slow in the Python community.
Unladen Swallow is
another example that showed that speeding up Python is not easy. Of course,
mypy has goals different from PyPy and other previous projects, and our
approach of targeting ahead-of-time
compilation slashes development efforts by a large factor. But I agree that
I won't be able to do it alone, and getting funding for continued development is
hard.
Based on suggestions from Guido and the above observations,
I've been working for some time on a pretty big
proposal that would help address all of the above issues in some form or
another. This is still in a planning stage, and no concrete plans are
yet finalized. However, here are the main points:
- For mypy to really take off, we need users. In order to realistically
get users, there needs
to be a low-risk way of adopting mypy incrementally in current
projects implemented in Python.
- There is a good amount of interest in optional typing in the Python
community,
but the approach should be non-invasive to current development
processes, tool chains, etc.
- The pragmatic way to resolve the two above issues is to make mypy
syntax 100% compatible with Python, both Python 2.x and 3.x. There
would be no need for a Python translation phase, and a normal Python
interpreter could be used to run mypy programs directly. Also all Python
tools would pretty much Just Work. Note that as this would be a syntactic
change, it would have no significant impact on planned efficiency of the
new VM compared to the current syntax and plans, though this would likely
result in semantic changes as well (see below for more about these).
Also, mypy already supports translation to Python. This would just remove
the need for the translation step.
- We should first focus most resources on the optional typing part instead
of the new VM and compiler in order to make
mypy usable as a static type checker for CPython (and PyPy/Jython).
- Now mypy would be much easier to adopt in organizations that would
like to use optional typing to get better maintainability and productivity.
I think that the above changes could speed
up the adoption of mypy a lot. Also, the type checker part of mypy
is a fairly straightforward project from an engineering point of view and
there is no need for a large team of developers.
- If mypy gets significant adoption, there would also be
demand for the new VM and
the compiler, and it would be easier (but still not exactly easy!)
to get contributors, maybe even development funding, etc.
The above plan would imply redesigning the type annotation syntax of mypy.
I've given it a lot of thought, and perhaps surprisingly, it seems that there
would not be a need for many compromises. Generally readability would be similar
to the current syntax,
and sometimes it would be even better. I'm not going to cover this
in detail now, but the main difference would be the introduction of Python 3
style annotation syntax (obviously for Python 3.x only; Python 2.x needs a
different approach):
NOW:
    str greeting(str name):
        return 'hello, ' + name
NEW PROPOSAL:
    def greeting(name: str) -> str:
        return 'hello, ' + name
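
As a minimal sketch of why this syntax needs no translation step: a standard Python 3 interpreter accepts the annotations, stores them on the function, and otherwise ignores them. The `repeat` function below is a hypothetical example added only to show that nothing is enforced at run time; catching that kind of mistake would be the job of a separate static checker.

```python
def greeting(name: str) -> str:
    return 'hello, ' + name


def repeat(value: int) -> int:
    # Hypothetical helper: the annotation says int, but nothing checks it
    # when the program runs.
    return value * 2


# The annotated function runs like any other Python 3 function.
message = greeting('world')

# The annotations are kept as plain metadata on the function object.
annotations = greeting.__annotations__

# Passing a str where an int is annotated still works at run time.
unchecked = repeat('ab')

print(message)
print(annotations)
print(unchecked)
```

The annotations are available for tools to inspect, but the interpreter itself neither needs nor uses them to run the code.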
Mypy uses nominal subtyping, even though structural subtyping would help
model 'duck typing' in Python. Many people have expressed their interest in
structural subtyping, and I discussed this at PyCon as well.
Earlier, I thought that this couldn't be implemented
efficiently on platforms that I would eventually like to be able to support,
including Dalvik
(Android). However, now I think I've figured out how to have efficient
structural subtyping on basically any VM that could realistically run mypy,
so the main objection is thrown out. Also,
with the proposed Python-compatible syntax, structural subtyping could be
a win for various reasons. In summary, it now seems likely that mypy will
get support for structural subtyping in addition to nominal subtyping. I've
started to prepare an enhancement proposal.
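
To make the distinction concrete, here is a minimal sketch of structural subtyping. It borrows `typing.Protocol` from the standard library of later Python versions (3.8+) purely as a stand-in; it is not the syntax of the enhancement proposal mentioned above, and `Closeable`, `Resource` and `shutdown` are hypothetical names.

```python
from typing import Protocol


class Closeable(Protocol):
    """Anything with a matching close() method counts as Closeable."""

    def close(self) -> None: ...


class Resource:
    # Note: Resource does NOT inherit from Closeable. Under nominal
    # subtyping it would be unrelated; under structural subtyping it
    # matches because it has the right method.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(item: Closeable) -> None:
    # A structural type checker accepts any object with close() here.
    item.close()


resource = Resource()
shutdown(resource)
print(resource.closed)
```

With nominal subtyping, `Resource` would have to declare `Closeable` as a base class explicitly before a checker would accept the call to `shutdown`.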
There are other, less major changes that Python compatibility would
require. Mypy should support multiple inheritance without
the current limitations, similar to Python.
Again, I previously ruled this out due to efficiency concerns,
but I think I was wrong and there is really no technical reason why
multiple inheritance needs to be restricted to interfaces like it is now.
Also, mypy needs to support metaclasses; this one is trickier but I'm optimistic
about it as well.
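
For reference, this is the kind of ordinary Python 3 code that would have to be accepted: a class that inherits implementation from two regular (non-interface) base classes and also uses a metaclass. It is only a sketch of the Python features involved; all class names are hypothetical and nothing here reflects how mypy would type check or compile it.

```python
class Registry(type):
    """Metaclass that records the name of every class it creates."""

    created = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.created.append(name)
        return cls


class Named:
    # A regular class with an implementation, not an interface.
    def describe(self):
        return 'name=' + self.name


class Sized:
    # A second regular class with its own implementation.
    def size(self):
        return len(self.name)


class Item(Named, Sized, metaclass=Registry):
    # Inherits method implementations from both base classes.
    def __init__(self, name):
        self.name = name


item = Item('spam')
print(item.describe(), item.size())
print(Registry.created)
print([c.__name__ for c in Item.__mro__])
```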
Let me know if you have any opinions on the proposed changes. Write
comments below or send me an
email.