This notebook contains a demonstration of new features present in the 0.41.0 release of Numba. Whilst release notes are produced as part of the CHANGE_LOG, there's nothing like seeing code in action!

Included are demonstrations of:

- initial support for Unicode strings
- diagnostics for the ParallelAccelerator technology
- newly supported NumPy functions
- support for literal values in the typing system
- synthesized stack frames in tracebacks from exceptions raised in jitted code
First, import the necessary from Numba and NumPy...
from numba import njit, config, __version__
from numba.extending import overload
import numpy as np
assert tuple(int(x) for x in __version__.split('.')[:2]) >= (0, 41)
Initial support for Unicode strings has been implemented for Python versions >= 3.4. Support for fundamental string operations has been added as well as support for strings as arguments and return value. The next release of Numba will contain performance updates and additional features to further enhance string support.
if config.PYVERSION >= (3, 4):  # Only supported in Python >= 3.4
    @njit
    def strings_demo(str1, str2, str3):
        # strings, ---^  ---^  ---^
        # as arguments are now supported!

        # defining strings in compiled code also works
        def1 = 'numba is '
        # as do unicode strings
        def2 = '🐍⚡'

        # also string concatenation
        print(str1 + str2)
        # comparison operations
        print(str1 == str2)
        print(str1 < str2)
        print(str1 <= str2)
        print(str1 > str2)
        print(str1 >= str2)
        # {starts,ends}with
        print(str1.startswith(str3))
        print(str2.endswith(str3))
        # len()
        print(len(str1), len(def2), len(str3))
        # str.find()
        print(str2.find(str3))
        # in
        print(str3 in str2)
        # slicing
        print(str2[1:], str1[:1])
        # and finally, strings can also be returned
        return '\nnum' + str1[1::-1] + def1[5:] + def2

    # run the demo
    print(strings_demo('abc', 'zba', 'a'))
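The return expression above packs several of the demonstrated features into one line; a plain-Python breakdown (no Numba required) shows what it evaluates to for the arguments used in the demo:

```python
str1 = 'abc'
def1 = 'numba is '
def2 = '🐍⚡'

part = str1[1::-1]   # slice from index 1 stepping backwards: 'ab' reversed -> 'ba'
suffix = def1[5:]    # drop the first five characters of 'numba is ' -> ' is '
result = '\nnum' + part + suffix + def2
print(repr(result))  # '\nnumba is 🐍⚡'
```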
The ParallelAccelerator technology is used when the parallel=True kwarg is supplied to @jit. This technology is what transforms the decorated function into one that can run on multiple CPUs. Whilst parallel=True has been implemented for some time, the optimizations taking place have not been exposed in a manner that is easy to understand. Numba 0.41.0 contains the first cut of a new diagnostics tool that aims to help demystify what ParallelAccelerator does internally as it transforms the function to run in parallel!
Documentation for this feature is available here, including an explanation of how to interpret the parallel diagnostics output.
from numba import prange  # import parallel range

# decorate a function with `parallel=True` as usual
@njit(parallel=True)
def test(x):
    n = x.shape[0]
    a = np.sin(x)      # parallel array expression
    b = np.cos(a * a)  # parallel array expression
    acc = 0
    for i in prange(n - 2):      # user defined parallel loop
        for j in prange(n - 1):  # user defined parallel loop
            acc += b[i] + b[j + 1]  # parallel reduction
    return acc

# run the function
test(np.arange(10))

# access the diagnostic output via the new `parallel_diagnostics` method on the dispatcher
test.parallel_diagnostics(level=4)
This release contains a number of newly supported NumPy functions: tri, tril, triu, partition, ediff1d, cov, nancumsum, nancumprod, conj and conjugate.
@njit
def numpy_new():
    # create some simple array data for use in np.tril and np.triu
    a = np.arange(12.).reshape(3, 4)
    print('Input array:')
    print(a)

    # try out np.tri, np.triu, np.tril
    print('np.tri(3)')
    print(np.tri(3))
    print('np.tril(a)')
    print(np.tril(a))
    print('np.triu(a, k=1)')
    print(np.triu(a, k=1))

    # copy and shuffle the simple array data for use with np.partition, np.ediff1d and np.cov
    a_unordered = a.copy()
    np.random.seed(0)
    np.random.shuffle(a_unordered.ravel())
    print('\nInput array:')
    print(a_unordered)

    # try out np.partition, np.ediff1d and np.cov
    print('np.partition(a_unordered, 0)')
    print(np.partition(a_unordered, 0))
    print('np.ediff1d(a_unordered)')
    print(np.ediff1d(a_unordered))
    print('np.cov(a_unordered)')
    print(np.cov(a_unordered))

    # create some data with NaN present to try out np.nancumsum and np.nancumprod
    a_w_nan = a.copy()
    a_w_nan.ravel()[::2] = np.nan
    print('\nInput array:')
    print(a_w_nan)

    # try out np.nancumsum and np.nancumprod
    print('np.nancumsum(a_w_nan)')
    print(np.nancumsum(a_w_nan))
    print('np.nancumprod(a_w_nan)')
    print(np.nancumprod(a_w_nan))

    # finally, create some data in the complex domain to try out np.conj and np.conjugate
    a_cmplx = a.copy() + a_unordered.copy() * 1j
    print('\nInput array:')
    print(a_cmplx)

    # try out np.conj and np.conjugate
    print('np.conj(a_cmplx)')
    print(np.conj(a_cmplx))
    print('np.conjugate(a_cmplx)')
    print(np.conjugate(a_cmplx))

numpy_new()
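Of the functions above, np.ediff1d is perhaps the least familiar: it flattens its input and returns the differences between consecutive elements. A pure-Python equivalent (illustrative only, no NumPy or Numba required) makes the semantics explicit:

```python
def ediff1d_py(a):
    # flatten the 2-D input (the equivalent of ravel) ...
    flat = [v for row in a for v in row]
    # ... then take the difference between each consecutive pair
    return [flat[i + 1] - flat[i] for i in range(len(flat) - 1)]

print(ediff1d_py([[1, 3], [6, 10]]))  # [2, 3, 4]
```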
Numba 0.41.0 has a significant change made to the typing system that aims to clean up the use of constants. This change takes the form of support for type specific literal values in the type inference mechanism. During typing, two passes are now made: the first with anything that is a constant and can be expressed as a literal set as such (integers, strings, slices and make_function are implemented as literals at present); the second with the standard types used for the constants. This, for example, permits value based dispatch as demonstrated below, but also opens up a lot of future possibilities surrounding typing which were inaccessible prior to this change.
from numba import generated_jit

@generated_jit
def myoverload(arg):
    literal_val = getattr(arg, 'literal_value', None)
    if literal_val is not None:
        if literal_val == 100:
            def impl_1(arg):
                return 'dispatched: impl_1(literal, value 100)'
            return impl_1
        else:
            def impl_2(arg):
                return 'dispatched: impl_2(literal, value not 100)'
            return impl_2
    else:
        def impl_3(arg):
            return 'dispatched: impl_3(non-literal type)'
        return impl_3

@njit
def example(x):
    print(myoverload(100))  # literal value 100, dispatches impl_1
    print(myoverload(99))   # literal value 99, dispatches impl_2
    a = 50 + 25 + 2 * 10 + 15 // 3  # `a` is const expr value 100
    print(myoverload(a))    # `a` has literal value 100, dispatches impl_1
    b = 50 * x              # `b` non-literal, it's an intp type
    print(myoverload(b))    # `b` non-literal intp, has no value, dispatches impl_3

example(2)
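The const expr line above works because the compiler folds the whole expression to 100 during typing, so `a` receives a literal type just as the bare constant 100 does. The folding is ordinary arithmetic, easy to verify in plain Python:

```python
# the sub-terms the compiler reduces first: 2 * 10 -> 20, 15 // 3 -> 5
terms = [50, 25, 2 * 10, 15 // 3]
a = sum(terms)
print(terms, '->', a)  # [50, 25, 20, 5] -> 100
```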
Finally (and left to last as an exception is raised!), tracebacks from exceptions raised in jitted code now contain a synthesized stack frame giving the location where the exception was raised. The stack frame is based on the Python source from which the code was compiled; it looks like a CPython traceback, but it comes from compiled code! This makes it easier to use exceptions in nopython mode as it is now possible to find out the location from which they were raised. Try commenting/uncommenting the @njit decorator and rerunning!
@njit
def raise_exception(x):
    if x == 0:
        raise Exception('raised x==0. Also, exception arguments are correctly handled', 123, 4j)

raise_exception(0)
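To inspect the exception arguments without aborting the run, the same body can be wrapped in try/except. The copy below (renamed raise_exception_py here, and shown without @njit so it runs anywhere) demonstrates the heterogeneous arguments surviving intact:

```python
def raise_exception_py(x):
    # same body as the jitted version above, in plain Python
    if x == 0:
        raise Exception('raised x==0. Also, exception arguments are correctly handled', 123, 4j)

try:
    raise_exception_py(0)
except Exception as e:
    # the string, int and complex arguments all come through unchanged
    print(e.args)
```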