SMOP is the Small Matlab and Octave to Python compiler.
SMOP translates matlab to python. Despite obvious similarities between matlab and numeric python, there are enough differences to make manual translation infeasible in real life.
SMOP generates human-readable python, which also appears to be faster than octave. Just how fast? Timing results for "Moving furniture" are shown in Table 1. For this program, translation to python gave roughly a twofold speedup, and compiling runtime.py to C with cython gave a further twofold speedup. This pseudo-benchmark measures scalar performance, and my interpretation is that scalar computations are of less interest to the octave team.
- October 15, 2014
- Version 0.26.3 is available for beta testing.
The next version, 0.27, is planned to compile the octave scripts library, which contains over 120 KLOC in almost 1,000 matlab files. There are 13 compilation errors with smop 0.26.3.
Network installation is the best method if you just want to run the example:
$ easy_install smop --user
Install from the sources if you are behind a firewall:
$ tar zxvf smop.tar.gz
$ cd smop
$ python setup.py install --user
Fork the github repository if you need the latest fixes.
Finally, it is possible to use smop without installing it, but only if you have already installed the dependencies, numpy and networkx:
$ tar zxvf smop.tar.gz
$ cd smop/smop
$ python main.py solver.m
$ python solver.py
We will translate solver.m to present a sample of smop features. The program was borrowed from the matlab programming competition in 2004 (Moving Furniture). To the left is solver.m. To the right is a.py, its translation to python. Though only 30 lines long, this example shows many of the complexities of converting matlab code to python.
    01 function mv = solver(ai,af,w)                  01 def solver_(ai,af,w,nargout=1):
    02   nBlocks = max(ai(:));                        02     nBlocks=max_(ai[:])
    03   [m,n] = size(ai);                            03     m,n=size_(ai,nargout=2)
Line 02: Matlab uses round brackets both for array indexing and for function calls. To figure out which is which, SMOP computes local use-def information, and then applies the following rule: undefined names are functions, while defined names are arrays.
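The rule in the note on line 02 can be illustrated with a toy classifier. This is only a sketch under strong assumptions, not SMOP's implementation: `classify_paren_uses` and `NAME_BEFORE_PAREN` are hypothetical names, statements are given as pre-split (targets, expression) pairs, and no control flow is followed.

```python
import re

# Matches an identifier immediately followed by "(".
NAME_BEFORE_PAREN = re.compile(r"([A-Za-z_]\w*)\s*\(")


def classify_paren_uses(params, statements):
    """Label each name(...) use as 'array' or 'function'.

    Hypothetical simplification: statements are straight-line
    (targets, expression) pairs, so no control flow is followed.
    """
    defined = set(params)  # parameters are defined on entry
    labels = []
    for targets, expression in statements:
        for match in NAME_BEFORE_PAREN.finditer(expression):
            name = match.group(1)
            kind = "array" if name in defined else "function"
            labels.append((name, kind))
        # Targets become defined only after the right-hand side is scanned.
        defined.update(targets)
    return labels


labels = classify_paren_uses(
    ["ai", "af", "w"],
    [
        (["nBlocks"], "max(ai(:))"),
        (["m", "n"], "size(ai)"),
        (["I"], "[0 1 0 -1]"),
        (["r"], "ceil(rand*4)"),
        (["step"], "I(r)"),
    ],
)
print(labels)
```

Here `ai` and `I` are already defined when they appear before a bracket, so they are treated as arrays, while `max`, `size` and `ceil` are never assigned and are treated as functions.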
    04   I = [0 1 0 -1];                              04     I=matlabarray([0,1,0,- 1])
    05   J = [1 0 -1 0];                              05     J=matlabarray([1,0,- 1,0])
    06   a = ai;                                      06     a=copy_(ai)
    07   mv = [];                                     07     mv=matlabarray()
Line 04: Matlab array indexing starts with one; python indexing starts with zero. The new class matlabarray derives from ndarray, but exposes matlab array behaviour.
Line 06: Matlab array assignment implies copying; python assignment implies data sharing. We use explicit copy here.
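Both points, one-based indexing and copy-on-assignment, can be seen in a few lines of plain python. The class below is a hypothetical stand-in written for this illustration (`OneBasedList` is not part of SMOP and is far simpler than matlabarray); it assumes a flat list and integer indices only.

```python
import copy


class OneBasedList:
    """Toy one-based sequence; a stand-in, not SMOP's matlabarray."""

    def __init__(self, items):
        self._items = list(items)

    def _offset(self, index):
        # Translate a one-based index into a zero-based list offset.
        if index < 1 or index > len(self._items):
            raise IndexError("index out of range: %d" % index)
        return index - 1

    def __getitem__(self, index):
        return self._items[self._offset(index)]

    def __setitem__(self, index, value):
        self._items[self._offset(index)] = value


I = OneBasedList([0, 1, 0, -1])
first, last = I[1], I[4]          # matlab-style I(1) and I(4)

shared = I                        # plain python assignment: same object
independent = copy.deepcopy(I)    # explicit copy, as matlab assignment implies
I[2] = 99
print(first, last, shared[2], independent[2])
```

After the update, `shared` sees the new value because it is the same object, while `independent` keeps the old one, which is the behaviour a matlab programmer expects from `a = ai`.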
    08   while ~isequal(af,a)                         08     while not isequal_(af,a):
    09     bid = ceil(rand*nBlocks);                  09         bid=ceil_(rand_() * nBlocks)
    10     [i,j] = find(a==bid);                      10         i,j=find_(a == bid,nargout=2)
    11     r = ceil(rand*4);                          11         r=ceil_(rand_() * 4)
    12     ni = i + I(r);                             12         ni=i + I[r]
    13     nj = j + J(r);                             13         nj=j + J[r]
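Line 09: Matlab functions of zero arguments, such as rand, can be used without parentheses. In python, parentheses are required.
Line 10: The expected number of return values from the matlab function find is explicitly passed in the parameter nargout.
Line 12: Variables I and J contain instances of the new class matlabarray, which uses one-based array indexing.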
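The nargout convention from the note on line 10 can be sketched as follows. This `find_` is a hypothetical toy working on nested lists, not the function from SMOP's runtime; it only assumes that, as in matlab, one output means column-major linear indices and two outputs mean separate row and column indices, all one-based.

```python
def find_(grid, nargout=1):
    """Toy find over a list-of-lists grid; results are one-based."""
    nrows = len(grid)
    ncols = len(grid[0]) if grid else 0
    rows, cols = [], []
    # Scan column by column, matching matlab's column-major order.
    for c in range(ncols):
        for r in range(nrows):
            if grid[r][c]:
                rows.append(r + 1)
                cols.append(c + 1)
    if nargout == 2:
        return rows, cols
    if nargout == 1:
        # Column-major linear index of each nonzero cell.
        return [(c - 1) * nrows + r for r, c in zip(rows, cols)]
    raise ValueError("this sketch supports nargout of 1 or 2 only")


grid = [[0, 0, 7],
        [0, 7, 0]]
linear = find_(grid)                 # like  k = find(grid)
i, j = find_(grid, nargout=2)        # like  [i,j] = find(grid)
print(linear, i, j)
```

Python cannot see how many names are on the left of the assignment, which is why the caller has to state the expected number of results explicitly.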
    14     if (ni<1) || (ni>m) || (nj<1) || (nj>n)    14         if (ni < 1) or (ni > m) or (nj < 1) or (nj > n):
    15       continue                                 15             continue
    16     end                                        16
    17     if a(ni,nj)>0                              17         if a[ni,nj] > 0:
    18       continue                                 18             continue
    19     end                                        19
    20     [ti,tj] = find(af==bid);                   20         ti,tj=find_(af == bid,nargout=2)
    21     d = (ti-i)^2 + (tj-j)^2;                   21         d=(ti - i) ** 2 + (tj - j) ** 2
    22     dn = (ti-ni)^2 + (tj-nj)^2;                22         dn=(ti - ni) ** 2 + (tj - nj) ** 2
    23     if (d<dn) && (rand>0.05)                   23         if (d < dn) and (rand_() > 0.05):
    24       continue                                 24             continue
    25     end                                        25
    26     a(ni,nj) = bid;                            26         a[ni,nj]=bid
    27     a(i,j) = 0;                                27         a[i,j]=0
    28     mv(end+1,[1 2]) = [bid r];                 28         mv[mv.shape[0] + 1,[1,2]]=[bid,r]
    29   end                                          29
    30                                                30     return mv
- With fewer than five thousand lines of python code, SMOP does not pretend to compete with such polished products as matlab or octave. Yet it is not a toy. The matlab language definition (never published afaik) is full of dark corners, and SMOP tries to follow the original matlab semantics as closely as possible.
- There is a price, too.
- The generated sources are matlabic, rather than pythonic, which means that library maintainers must be fluent in both languages, and the old development environment must be kept around.
- Should the generated program be pythonic or matlabic?
For example, should array indexing start with zero (pythonic) or with one (matlabic)?
I believe now that some matlabic accent is unavoidable in the generated python sources. Imagine a matlab program that uses regular expressions, matlab style. We are not going to translate them to python style, and that code will remain forever as a reminder of the program's matlab origin.
Another example. Matlab code opens a file; fopen returns -1 on error. Pythonic code would raise an exception, but we are not going to do that. Instead, we will live with the accent, and smop takes this to the extreme: the matlab program remains mostly unchanged.
It turns out that generating matlabic code allows for moving much of the project complexity out of the compiler (which is already complicated enough) and into the runtime library, where there is almost no interaction between the library parts.
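A matlabic fopen in the runtime library might look roughly like the sketch below. The names `fopen_`, `fclose_`, `_open_files` and `_next_fid` are assumptions made for this illustration and are not taken from SMOP's runtime; the sketch only mirrors the matlab convention that fopen returns -1 on failure and that identifiers 0, 1 and 2 are reserved for the standard streams.

```python
import os
import tempfile

_open_files = {}   # file identifier -> python file object
_next_fid = 3      # 0, 1 and 2 are reserved for the standard streams


def fopen_(path, mode="r"):
    """Open a file matlab-style: return an integer id, or -1 on error."""
    global _next_fid
    try:
        handle = open(path, mode)
    except OSError:
        return -1          # matlabic: report failure, do not raise
    fid = _next_fid
    _next_fid += 1
    _open_files[fid] = handle
    return fid


def fclose_(fid):
    """Close a file by id: return 0 on success, -1 for an unknown id."""
    handle = _open_files.pop(fid, None)
    if handle is None:
        return -1
    handle.close()
    return 0


# A path inside a directory that is assumed not to exist.
missing_fid = fopen_(os.path.join(tempfile.gettempdir(),
                                  "smop-sketch-no-such-dir", "data.txt"))

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
good_fid = fopen_(tmp.name)
close_status = fclose_(good_fid)
os.remove(tmp.name)
print(missing_fid, good_fid, close_status)
```

Because the wrapper keeps the matlab contract, translated code that tests `fid == -1` keeps working without any change in the compiler.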
- Which one is faster, python or octave? I don't know.
- Doing reliable performance measurements is notoriously hard, and is of low priority for me now. Instead, I wrote a simple driver go.py and rewrote rand so that the python and octave versions run the same code. Then I ran the above example on my laptop. The python version ran about twice as fast. What does it mean? Probably nothing. YMMV.
ai = zeros(10,10);
af = ai;
ai(1,1)=2;
ai(2,2)=3;
ai(3,3)=4;
ai(4,4)=5;
ai(5,5)=1;
af(9,9)=1;
af(8,8)=2;
af(7,7)=3;
af(6,6)=4;
af(10,10)=5;
tic;
mv = solver(ai,af,0);
toc
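For comparison, a python-side timing driver could be sketched like this. It is not the go.py mentioned above, whose contents are not shown here; `make_boards`, `time_call` and the placeholder solver are hypothetical, and the boards are plain nested lists with the matlab positions shifted to zero-based offsets.

```python
import time


def make_boards(size=10):
    """Build the start and target boards used by the matlab driver."""
    ai = [[0] * size for _ in range(size)]
    af = [[0] * size for _ in range(size)]
    # ai(1,1)=2 ... ai(5,5)=1, shifted to zero-based offsets.
    for k, block in enumerate([2, 3, 4, 5, 1]):
        ai[k][k] = block
    # af(9,9)=1, af(8,8)=2, af(7,7)=3, af(6,6)=4, af(10,10)=5.
    for pos, block in [(9, 1), (8, 2), (7, 3), (6, 4), (10, 5)]:
        af[pos - 1][pos - 1] = block
    return ai, af


def time_call(solver, *args):
    """Run solver(*args) once; return its result and elapsed seconds."""
    start = time.perf_counter()
    result = solver(*args)
    return result, time.perf_counter() - start


def placeholder_solver(ai, af, w):
    # Stand-in for the translated solver; it performs no moves.
    return []


ai, af = make_boards()
mv, elapsed = time_call(placeholder_solver, ai, af, 0)
print("elapsed: %.6f s" % elapsed)
```

A single timed run like this is as rough as the tic/toc measurement above; repeating the call and fixing the random seed would be needed before reading much into the numbers.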
Running the test suite:
$ cd smop
$ make check
$ make test
lei@dilbert ~/smop-github/smop $ python main.py -h
SMOP compiler version 0.25.1
Usage: smop [options] file-list
    Options:
    -V --version
    -X --exclude=FILES      Ignore files listed in comma-separated list FILES
    -d --dot=REGEX          For functions whose names match REGEX, save debugging
                            information in "dot" format (see www.graphviz.org).
                            You need an installation of graphviz to use --dot
                            option.  Use "dot" utility to create a pdf file.
                            For example:
                                $ python main.py fastsolver.m -d "solver|cbest"
                                $ dot -Tpdf -o resolve_solver.pdf resolve_solver.dot
    -h --help
    -o --output=FILENAME    By default create file named a.py
    -o- --output=-          Use standard output
    -s --strict             Stop on the first error
    -v --verbose