This module contains the basic functions we need in other modules of the fastai library (split with torch_core, which contains the functions requiring PyTorch). Its documentation can easily be skipped on a first read, unless you want to know what a given function does.
from fastai.gen_doc.nbdoc import *
from fastai.core import *
default_cpus = min(16, num_cpus())
show_doc(has_arg)
has_arg [source][test]
has_arg(func, arg) → bool
No tests found for has_arg. To contribute a test please refer to this guide and this discussion.
Check if func accepts arg.
Examples for two fastai.core functions. Docstring shown before calling has_arg for reference:
has_arg(download_url,'url')
True
has_arg(index_row,'x')
False
has_arg(index_row,'a')
True
show_doc(ifnone)
param,alt_param = None,5
ifnone(param,alt_param)
5
param,alt_param = None,[1,2,3]
ifnone(param,alt_param)
[1, 2, 3]
show_doc(is1d)
two_d_array = np.arange(12).reshape(6,2)
print( two_d_array )
print( is1d(two_d_array) )
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]]
False
is1d(two_d_array.flatten())
True
show_doc(is_listy)
Check if x is a Collection. Tuple or List qualify.
some_data = [1,2,3]
is_listy(some_data)
True
some_data = (1,2,3)
is_listy(some_data)
True
some_data = 1024
print( is_listy(some_data) )
False
print( is_listy( [some_data] ) )
True
some_data = dict([('a',1),('b',2),('c',3)])
print( some_data )
print( some_data.keys() )
{'a': 1, 'b': 2, 'c': 3}
dict_keys(['a', 'b', 'c'])
print( is_listy(some_data) )
print( is_listy(some_data.keys()) )
False
False
print( is_listy(list(some_data.keys())) )
True
show_doc(is_tuple)
Check if x is a tuple.
print( is_tuple( [1,2,3] ) )
False
print( is_tuple( (1,2,3) ) )
True
show_doc(arange_of)
arange_of [source][test]
arange_of(x)
No tests found for arange_of. To contribute a test please refer to this guide and this discussion.
Same as range_of but returns an array.
arange_of([5,6,7])
array([0, 1, 2])
type(arange_of([5,6,7]))
numpy.ndarray
show_doc(array)
array [source][test]
array(a, dtype:type=None, **kwargs) → ndarray
Tests found for array:
Some other tests where array is used:
pytest -sv tests/test_core.py::test_arrays_split [source]
pytest -sv tests/test_core.py::test_even_mults [source]
pytest -sv tests/test_core.py::test_idx_dict [source]
pytest -sv tests/test_core.py::test_is1d [source]
pytest -sv tests/test_core.py::test_itembase_eq [source]
pytest -sv tests/test_core.py::test_itembase_hash [source]
pytest -sv tests/test_core.py::test_one_hot [source]
pytest -sv tests/test_torch_core.py::test_model_type [source]
pytest -sv tests/test_torch_core.py::test_tensor_array_monkey_patch [source]
pytest -sv tests/test_torch_core.py::test_tensor_with_ndarray [source]
pytest -sv tests/test_torch_core.py::test_to_detach [source]
To run tests please refer to this guide.
Same as np.array but also handles generators. kwargs are passed to np.array with dtype.
array([1,2,3])
array([1, 2, 3])
Note that after we call the generator, we do not reset it. So the array call has 5 fewer entries than it would if we ran it from the start of the generator.
def data_gen():
    i = 100.01
    while i<200:
        yield i
        i += 1.
ex_data_gen = data_gen()
for _ in range(5):
    print(next(ex_data_gen))
100.01
101.01
102.01
103.01
104.01
array(ex_data_gen)
array([105.01, 106.01, 107.01, 108.01, ..., 196.01, 197.01, 198.01, 199.01])
ex_data_gen_int = data_gen()
array(ex_data_gen_int,dtype=int) #Cast output to int array
array([100, 101, 102, 103, ..., 196, 197, 198, 199])
show_doc(arrays_split)
data_a = np.arange(15)
data_b = np.arange(15)[::-1]
mask_a = (data_a > 10)
print(data_a)
print(data_b)
print(mask_a)
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
[14 13 12 11 10  9  8  7  6  5  4  3  2  1  0]
[False False False False False False False False False False False  True
  True  True  True]
arrays_split(mask_a,data_a)
[(array([11, 12, 13, 14]),), (array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]),)]
np.vstack([data_a,data_b]).transpose().shape
(15, 2)
arrays_split(mask_a,np.vstack([data_a,data_b]).transpose()) #must match on dimension 0
[(array([[11,  3],
        [12,  2],
        [13,  1],
        [14,  0]]),), (array([[ 0, 14],
        [ 1, 13],
        [ 2, 12],
        [ 3, 11],
        [ 4, 10],
        [ 5,  9],
        [ 6,  8],
        [ 7,  7],
        [ 8,  6],
        [ 9,  5],
        [10,  4]]),)]
show_doc(chunks)
You can transform a Collection into an Iterable of n-sized chunks by calling chunks:
data = [0,1,2,3,4,5,6,7,8,9]
for chunk in chunks(data, 2):
    print(chunk)
[0, 1]
[2, 3]
[4, 5]
[6, 7]
[8, 9]
for chunk in chunks(data, 3):
    print(chunk)
[0, 1, 2]
[3, 4, 5]
[6, 7, 8]
[9]
show_doc(df_names_to_idx)
ex_df = pd.DataFrame.from_dict({"a":[1,1,1],"b":[2,2,2]})
print(ex_df)
   a  b
0  1  2
1  1  2
2  1  2
df_names_to_idx('b',ex_df)
[1]
show_doc(extract_kwargs)
extract_kwargs [source][test]
extract_kwargs(names:StrList, kwargs:KWArgs)
No tests found for extract_kwargs. To contribute a test please refer to this guide and this discussion.
Extract the keys in names from the kwargs.
key_word_args = {"a":2,"some_list":[1,2,3],"param":'mean'}
key_word_args
{'a': 2, 'some_list': [1, 2, 3], 'param': 'mean'}
(extracted_val,remainder) = extract_kwargs(['param'],key_word_args)
print( extracted_val,remainder )
{'param': 'mean'} {'a': 2, 'some_list': [1, 2, 3]}
show_doc(idx_dict)
idx_dict(['a','b','c'])
{'a': 0, 'b': 1, 'c': 2}
show_doc(index_row)
index_row [source][test]
index_row(a:Union[Collection[T_co], DataFrame, Series], idxs:Collection[int]) → Any
No tests found for index_row. To contribute a test please refer to this guide and this discussion.
Return the slice of a corresponding to idxs.
a is basically something you can index into, like a dataframe, an array or a list.
data = [0,1,2,3,4,5,6,7,8,9]
index_row(data,4)
4
index_row(pd.Series(data),7)
7
data_df = pd.DataFrame([data[::-1],data]).transpose()
data_df
|   | 0 | 1 |
|---|---|---|
| 0 | 9 | 0 |
| 1 | 8 | 1 |
| 2 | 7 | 2 |
| 3 | 6 | 3 |
| 4 | 5 | 4 |
| 5 | 4 | 5 |
| 6 | 3 | 6 |
| 7 | 2 | 7 |
| 8 | 1 | 8 |
| 9 | 0 | 9 |
index_row(data_df,7)
0    2
1    7
Name: 7, dtype: int64
show_doc(listify)
to_match = np.arange(12)
listify('a',to_match)
['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']
listify('a',5)
['a', 'a', 'a', 'a', 'a']
listify(77.1,3)
[77.1, 77.1, 77.1]
listify( (1,2,3) )
[1, 2, 3]
listify((1,2,3),('a','b','c'))
[1, 2, 3]
show_doc(random_split)
Splitting is done here with random.uniform(), so you may not get the exact split percentage for small data sets.
data = np.arange(20).reshape(10,2)
data.tolist()
[[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11], [12, 13], [14, 15], [16, 17], [18, 19]]
random_split(0.20,data.tolist())
[(array([[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15],
        [16, 17],
        [18, 19]]),), (array([], shape=(0, 2), dtype=int64),)]
random_split(0.20,pd.DataFrame(data))
[(array([[ 0,  1],
        [ 4,  5],
        [ 8,  9],
        [10, 11],
        [16, 17],
        [18, 19]]),), (array([[ 2,  3],
        [ 6,  7],
        [12, 13],
        [14, 15]]),)]
show_doc(range_of)
range_of [source][test]
range_of(x)
No tests found for range_of. To contribute a test please refer to this guide and this discussion.
Create a range from 0 to len(x).
range_of([5,4,3])
[0, 1, 2]
range_of(np.arange(10)[::-1])
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
show_doc(series2cat)
data_df = pd.DataFrame.from_dict({"a":[1,1,1,2,2,2],"b":['f','e','f','g','g','g']})
data_df
|   | a | b |
|---|---|---|
| 0 | 1 | f |
| 1 | 1 | e |
| 2 | 1 | f |
| 3 | 2 | g |
| 4 | 2 | g |
| 5 | 2 | g |
data_df['b']
0    f
1    e
2    f
3    g
4    g
5    g
Name: b, dtype: object
series2cat(data_df,'b')
data_df['b']
0    f
1    e
2    f
3    g
4    g
5    g
Name: b, dtype: category
Categories (3, object): [e < f < g]
series2cat(data_df,'a')
data_df['a']
0    1
1    1
2    1
3    2
4    2
5    2
Name: a, dtype: category
Categories (2, int64): [1 < 2]
show_doc(split_kwargs_by_func)
split_kwargs_by_func [source][test]
split_kwargs_by_func(kwargs, func)
No tests found for split_kwargs_by_func. To contribute a test please refer to this guide and this discussion.
Split kwargs between those expected by func and the others.
key_word_args = {'url':'http://fast.ai','dest':'./','new_var':[1,2,3],'testvalue':42}
split_kwargs_by_func(key_word_args,download_url)
({'url': 'http://fast.ai', 'dest': './'}, {'new_var': [1, 2, 3], 'testvalue': 42})
show_doc(to_int)
to_int(3.1415)
3
data = [1.2,3.4,7.25]
to_int(data)
[1, 3, 7]
show_doc(uniqueify)
uniqueify( pd.Series(data=['a','a','b','b','f','g']) )
['a', 'b', 'f', 'g']
show_doc(PrePostInitMeta)
class PrePostInitMeta [source][test]
PrePostInitMeta(name, bases, dct) :: type
No tests found for PrePostInitMeta. To contribute a test please refer to this guide and this discussion.
A metaclass that calls optional __pre_init__ and __post_init__ methods.
class _T(metaclass=PrePostInitMeta):
    def __pre_init__(self): self.a = 0; assert self.a==0
    def __init__(self): self.a += 1; assert self.a==1
    def __post_init__(self): self.a += 1; assert self.a==2
t = _T()
t.a
2
show_doc(download_url)
show_doc(find_classes)
show_doc(join_path)
show_doc(join_paths)
show_doc(loadtxt_str)
loadtxt_str [source][test]
loadtxt_str(path:PathOrStr) → ndarray
No tests found for loadtxt_str. To contribute a test please refer to this guide and this discussion.
Return ndarray of str of lines of text from path.
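A quick sketch, assuming write access to the working directory (the file name example.txt is just for illustration): write two lines of text to a file, then load them back as an array of strings.
with open('example.txt', 'w') as f:
    f.write('first_line\nsecond_line\n') # create a small text file to load
loadtxt_str('example.txt')
# expected: an array like array(['first_line', 'second_line'], dtype='<U11')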
show_doc(save_texts)
save_texts [source][test]
save_texts(fname:PathOrStr, texts:StrList)
No tests found for save_texts. To contribute a test please refer to this guide and this discussion.
Save in fname the content of texts.
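A quick sketch (texts.txt is just an illustrative file name): each string in texts is written to fname on its own line.
save_texts('texts.txt', ['hello', 'world']) # write one string per line
open('texts.txt').read()
# expected: 'hello\nworld\n'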
show_doc(num_cpus)
show_doc(parallel)
parallel [source][test]
parallel(func, arr:Collection[T_co], max_workers:int=None, leave=False)
No tests found for parallel. To contribute a test please refer to this guide and this discussion.
Call func on every element of arr in parallel using max_workers.
func must accept both the value and index of each arr element.
def my_func(value, index):
    print("Index: {}, Value: {}".format(index, value))

my_array = [i*2 for i in range(5)]
parallel(my_func, my_array, max_workers=3)
Index: 0, Value: 0
Index: 1, Value: 2
Index: 2, Value: 4
Index: 4, Value: 8
Index: 3, Value: 6
show_doc(partition)
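partition splits a into consecutive parts of size sz; the last part may be shorter. A quick sketch:
partition(list(range(10)), 4) # parts of size 4
# expected: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]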
show_doc(partition_by_cores)
partition_by_cores [source][test]
partition_by_cores(a:Collection[T_co], n_cpus:int) → List[Collection[T_co]]
No tests found for partition_by_cores. To contribute a test please refer to this guide and this discussion.
Split data in a equally among n_cpus cores.
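A quick sketch: split 10 items across 4 cores, giving each core a consecutive slice of the data.
partition_by_cores(list(range(10)), 4)
# expected: a list of sublists covering all 10 elements, e.g. [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]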
show_doc(ItemBase, title_level=3)
All items used in fastai should subclass this. Must have a data field that will be used when collating into mini-batches.
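A minimal sketch of a custom item; FloatPair is a hypothetical class, not part of fastai. data holds what gets collated into mini-batches, while obj keeps the human-readable form.
class FloatPair(ItemBase):
    def __init__(self, x, y):
        self.obj = (x, y)                              # human-readable form
        self.data = np.array([x, y], dtype=np.float32) # collated into mini-batches
    def __str__(self): return str(self.obj)

FloatPair(1.5, 2.5).data
# expected: array([1.5, 2.5], dtype=float32)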
show_doc(ItemBase.apply_tfms)
apply_tfms [source][test]
apply_tfms(tfms:Collection[T_co], **kwargs)
No tests found for apply_tfms. To contribute a test please refer to this guide and this discussion.
Subclass this method if you want to apply data augmentation with tfms to this ItemBase.
show_doc(ItemBase.show)
show [source][test]
show(ax:Axes, **kwargs)
No tests found for show. To contribute a test please refer to this guide and this discussion.
Subclass this method if you want to customize the way this ItemBase is shown on ax.
The default behavior is to set the string representation of this object as the title of ax.
show_doc(Category, title_level=3)
show_doc(EmptyLabel, title_level=3)
class EmptyLabel [source][test]
EmptyLabel() :: ItemBase
No tests found for EmptyLabel. To contribute a test please refer to this guide and this discussion.
Should be used for a dummy label.
show_doc(MultiCategory, title_level=3)
Create a MultiCategory with an obj that is a collection of labels. data corresponds to the one-hot encoded labels and raw is a list of the associated strings.
show_doc(FloatItem)
show_doc(camel2snake)
camel2snake('DeviceDataLoader')
'device_data_loader'
show_doc(even_mults)
In linear scales each element is equidistant from its neighbors:
# from 1 to 10 in 5 steps
t = np.linspace(1, 10, 5)
t
array([ 1. , 3.25, 5.5 , 7.75, 10. ])
for i in range(len(t) - 1):
    print(t[i+1] - t[i])
2.25
2.25
2.25
2.25
In logarithmic scales, each element is a multiple of the previous entry:
t = even_mults(1, 10, 5)
t
array([ 1. , 1.778279, 3.162278, 5.623413, 10. ])
# notice how each number is a multiple of its predecessor
for i in range(len(t) - 1):
    print(t[i+1] / t[i])
1.7782794100389228
1.7782794100389228
1.7782794100389228
1.7782794100389228
show_doc(func_args)
func_args [source][test]
func_args(func) → bool
No tests found for func_args. To contribute a test please refer to this guide and this discussion.
Return the arguments of func.
func_args(download_url)
('url', 'dest', 'overwrite', 'pbar', 'show_progress', 'chunk_size', 'timeout', 'retries')
Additionally, func_args can be used with functions that do not belong to the fastai library:
func_args(np.linspace)
('start', 'stop', 'num', 'endpoint', 'retstep', 'dtype')
show_doc(noop)
Return x.
# object is returned as-is
noop([1,2,3])
[1, 2, 3]
show_doc(one_hot)
One-hot encoding is a standard machine learning technique. Assume we are dealing with a 10-class classification problem and we are supplied a list of labels:
y = [1, 4, 4, 5, 7, 9, 2, 4, 0]
jekyll_note("""y is zero-indexed, therefore its first element (1) belongs to class 2, its second element (4) to class 5 and so on.""")
len(y)
9
y can equivalently be expressed as a matrix of 9 rows and 10 columns, where each row represents one element of the original y.
for label in y:
    print(one_hot(label, 10))
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
show_doc(show_some)
show_some [source][test]
show_some(items:Collection[T_co], n_max:int=5, sep:str=',')
No tests found for show_some. To contribute a test please refer to this guide and this discussion.
Return the representation of the first n_max elements in items.
# select 3 elements from a list
some_data = show_some([10, 20, 30, 40, 50], 3)
some_data
'10,20,30...'
type(some_data)
str
# the separator can be changed
some_data = show_some([10, 20, 30, 40, 50], 3, sep = '---')
some_data
'10---20---30...'
some_data[:-3]
'10---20---30'
show_some can take as input any class with __len__ and __getitem__:
class Any(object):
    def __init__(self, data):
        self.data = data
    def __len__(self):
        return len(self.data)
    def __getitem__(self,i):
        return self.data[i]
some_other_data = Any('nice')
show_some(some_other_data, 2)
'n,i...'
show_doc(subplots)
show_doc(text2html_table)
text2html_table [source][test]
text2html_table(items:Tokens) → str
No tests found for text2html_table. To contribute a test please refer to this guide and this discussion.
Put the texts in items in an HTML table; widths are the widths of the columns in %.
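A quick sketch: items is a collection of rows, each a collection of strings, with the first row used as the header; the result is a single HTML string (shown truncated here, since the exact markup is an implementation detail).
rows = [['first', 'second'], ['a', 'b']] # header row plus one data row
text2html_table(rows)[:30]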
show_doc(is_dict)
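As the name suggests, is_dict checks whether x is a dict. A quick sketch:
print(is_dict({'a': 1}))  # expected: True
print(is_dict([1, 2, 3])) # expected: False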