This project is pretty old and unmaintained. The chain functionality is buggy, and i don't intend to fix it, as i have moved to python3.
The repo will be kept for reference.
AsyncKit is a micro-toolkit for doing thread-pooled async work in python (in your otherwise hacked togethe synchronous single-file script)
It is nothing fancy, but is really great for running a lot of IO-heavy work in parallel and return the results (Like scraping).
Most of the time you will only need to import the Pool class
Import the Pool:
from asynckit import Pool
Create a pool object with your desired number of workers:
my_pool = Pool(worker_count=4)
Add some work
Work in asynckit is any callable. Simple as that:
# our work is to download some url and return the result def download_url(url): return urllib2.urlopen(url).read() # we schedule the work by calling the Pools .do() method our_async_value = my_pool.do(download_url, 'http://www.python.org/') # to retrieve the result, call .get() on the async value # Note that we hand .get() the True argument. True means "wait for result forever" # you can also give it a float or an integer to wait that number of seconds for the result python_org_site = our_async_value.get(True) print len(python_org_site) # If we use a timeout, and the timeout expires, get() will return None # Also note that if our work (the download_url() function) throws an exception, # the exception is raised when calling .get() # So always remember to call .get() if you need the exception raised. # (you can check for exceptions by calling the .is_error() method in the AsyncValue)
In the above example we tell the pool
my_pool to call
download_url with the argument
You can add as many arguments as you like, and even keyword arguments like you would expect from any regular function call.
The return value of the
.do() method is an object of type
AsyncValue is a
threading.Event-ish object with the added bonus of containing a value (and exceptions if any).
When your work completes, the return value will be stored in the
ready for retrieval.
Before you retrieve your value, ensure that the work is completed by checking if the
AsyncValue object is set with the
You can wait for the result by calling the objects
.wait() method ( NOTE:
.wait() blocks the current thread! )
See http://docs.python.org/2/library/threading.html#event-objects for how to work
AsyncValue object as a
When you are ready to retrieve your value, call the
Alternatively you can call
.get() with a timeout in seconds to block and wait for the result.
This is identical to calling
.wait() just before calling
.get() returns the return value of your work, or raise an exception thrown inside your work.
If you passed a timeout in seconds to
.get() it will return
None if the timeout expired.
Joining multiple AsyncValues
Usually you want to perform n times work in parallel, and wait for all of it to complete.
This can be achieved with the
from asynckit import Pool, AsyncList import urllib2 # define our heavy work def download(url): return urllib2.urlopen(url).read() # create a pool pool = Pool(worker_count=2) # then schedule the heavy work on the pool result1 = pool.do(download, 'http://tudb.org') result2 = pool.do(download, 'http://github.com') result3 = pool.do(download, 'http://tudb.org') # then we create an AsyncList with our AsyncValues in it my_downloads = AsyncList([result1, result2, result3]) # The AsyncList is itself an AsyncValue, with is_set() and wait() # We could call .wait() on our list to wait for the results to complete # my_downloads.wait() # or we can simply tell the .get() method to wait by passing True as first argument # the .get() method returns a list of values stored in our AsyncValue results print [len(site) for site in my_downloads.get(True)]
In 0.4 the
.chain() methods was introduced, allowing you to chain work in a more natural way.
.chain() works just like a
.do() method, except it will wait until complete before scheduling
your chained work:
from asynckit import Pool import urllib2 def download(url): return urllib2.urlopen(url).read() pool = Pool() # the final result will only contain the return value of the _last_ chain call final_result = pool.do(download, 'http://tudb.org').chain(download, 'http://tudb.org') print len(final_result.get(True))