Last modified: 21.09.2024

PREFACE

NOTE: This document is updated from time to time, so I recommend checking occasionally that you have the latest version.

If you are viewing this through "nbviewer", save the file by clicking the download button in the top-right corner of the page, right-clicking anywhere on the page that opens, and choosing "Save as" to store it in a folder of your choice. Jupyter must be installed to open the file; if you are a beginner, see the installation details below.

Installation¶

EDIT (09.2024): I now use only Colab. With its AI support it has unquestionably become the only tool I need. Getting a task done with simple prompts is very easy.

Ekran Görüntüsü - 2024-09-21 18-32-19.png

After clicking "Oluştur" (Generate), I type my prompt, and voilà!

Ekran Görüntüsü - 2024-09-21 18-33-34.png

The older notes below still apply for those who cannot use Colab.

If you are starting Python for the first time, you first need to install "Anaconda". You can learn how from this page or watch the video just below.

NOTE: Anaconda bundles many packages and programs and therefore takes up a lot of disk space. Later on I recommend uninstalling Anaconda, installing plain Python, and adding only the packages you need. Because everything comes ready to use, though, Anaconda is a sensible place to start. If disk space is not a problem, you can also stay with Anaconda.

Along with the Python environment, Anaconda installs the Jupyter Notebook application and IDEs such as Spyder. Many important libraries (numpy, pandas, sklearn and so on) are installed as well. I prepared this document in Jupyter, so we will mostly work in Jupyter. You can learn Spyder on your own by exploring it and watching a few YouTube videos.

This notebook does not cover Jupyter usage in detail. Rather than giving every detail, I aimed for a bedside reference. In other words, expect a large cheatsheet (quick reference) rather than a full training document. At the very end of the document you will find another, highly condensed cheatsheet.

I did not want to bloat the document with needless explanations. In most cases you will understand what is what once you run the code. Where something is hard to follow, there are extra explanations.

That said, some familiarity with the programming world helps. If you are starting from absolute zero, this document may feel a bit heavy. Learn the basics (variable, function, algorithm, object, etc.) from another source first, and use this document as a cheatsheet.

In [1]:

from IPython.display import YouTubeVideo
YouTubeVideo('JEv5oigBUL0')

Out[1]:

Jupyter¶

Guide¶

What is worth knowing when using Jupyter?¶

Keyboard shortcuts (press H in Jupyter to list them) --> be sure to use them, they speed you up a lot. If you try to do everything from the menus at the top, you will progress slowly.
magic functions (Google them) --> a little later, not now
smart suggestions (tab, tab+tab) --> give you information about classes and methods; try them once you start practicing code. There are sample screenshots for print below.
nbextensions and jupyter_helpers --> many niceties, including a table of contents in the left panel and heading numbering
help and dir provide help --> try them (example below)
you will often need the type function to find out a variable's type --> try it
naming convention (PEP 8) --> not critical but important in my opinion; Google it
guides to using Jupyter effectively (Medium etc.) --> not right away, but Google them after a while
https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/
https://towardsdatascience.com/productivity-tips-for-jupyter-python-a3614d70c770

In [2]:

a=1
liste=[]
print(type(a),type(liste))

<class 'int'> <class 'list'>

In [3]:

help(print) # help for the print function

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.

In [4]:

[x for x in dir(dict) if "__" not in x] # properties and methods of the dict object. A bare dir(dict) would have
# produced a very long list, so I narrowed it with the 'list comprehension' technique covered a little further below

Out[4]:

['clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

If, after installing Anaconda, you install the nbextensions mentioned above, download this notebook to your own PC and open it in Jupyter, you will see it with a table of contents and numbered headings as shown below. The installation steps are at https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/install.html. Before that, you may want to glance at the Module, Package, Class section a little further below for a brief introduction to the concepts. (If this seems confusing, skip this step and try again once you have a bit more experience. My advice, however, is to make a few attempts at installing it; it really pays off.)

NOTE: Images such as the screenshot above are easy to add to your notebook. Just open a Markdown cell, enter it, and paste the image from your clipboard.

Multiple outputs¶

Normally, if a cell contains no "print" statement, only the value of the last expression is shown as output. With the following code block, however, the value of every expression that stands on its own line is displayed.

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

For example, if the code below is run without these two lines, only the result 2 is shown,

a=1
b=2
a
b

whereas once the two lines above have been run, both 1 and 2 are displayed.

In [5]:

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [6]:

a=1
b=2
a
b

Out[6]:

Various operators¶

In [7]:

# The ! prefix runs operating-system commands, as if typed in cmd on Windows or in a terminal on Linux
# !dir # if I were on Windows
!ls # ls, because I am on Colab and Colab runs on Linux

drive  sample_data

In [8]:

!pwd

/content

In [9]:

# Use % rather than ! here: !cd changes the directory only for that one line,
# whereas %cd changes it for the whole notebook.
# Keep this comment on its own line: %cd treats the rest of its line as the path.
%cd sample_data
!ls

In [10]:

# comments are written with the "#" character

In [11]:

%%HTML
<!-- Text in other languages (HTML, JavaScript, etc.) can be written with the matching cell magic, which must be the first line of the cell -->
We <em>can use</em> different language options. Here we used HTML.

We can use different language options. Here we used HTML.

Mathematical expressions¶

In [12]:

%%latex # LaTeX lets us enter mathematical formulas
$$E=mc^2$$
$sin(x)/x$

$$E=mc^2$$ $sin(x)/x$

Alignment
The cell above was a code cell; this one is a Markdown cell
Writing with $$ centers the formula
Writing with $ keeps it inline, left-aligned and small

$$\Pi p(n)$$
$$\Pi \pi$$

$\frac \pi 2$

Other expressions

\frac for fractions
the _ character for indices (subscripts)
special expressions such as \pi, \sum, \bar and \hat

$$\sum p(n)$$
$$y_i, \bar{y}_i, \hat{y}_i$$
$$TSS=\sum_{i=1}^n(y_i-\bar{y})^2$$
$$P(c|x)=\frac{P(x|c) \cdot P(c)}{P(x)}$$

In [13]:

# this module lets us enter more elaborate mathematical formulas
from IPython.display import Math
Math(r'F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k x} dx')

Out[13]:

$\displaystyle F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k x} dx$

In [14]:

# there is more: we can also embed audio and video
from IPython.display import Audio
Audio(url="http://www.nch.com.au/acm/8k16bitpcm.wav")

Out[14]:

In [15]:

from IPython.display import YouTubeVideo
YouTubeVideo('agj3AxNPDWU')

Out[15]:

For more information about LaTeX and IPython.display

General syntax¶

In [16]:

sayi=1 # variables are assigned directly. Try to avoid Turkish (non-ASCII) characters in names

Spreading a variable's value over multiple lines¶

In [17]:

linecont="merhaba burada line "+ \
" contination uygulandı" + \
"ama sonuç yine de bitişik yazar"
print(linecont)

merhaba burada line  contination uygulandıama sonuç yine de bitişik yazar

In [18]:

satırayaygın="""
bu satırlar
ise satırlara
yayılmış durum
"""
print(satırayaygın)

bu satırlar
ise satırlara
yayılmış durum

In [19]:

# for data structures we can wrap lines without either of the two methods above
gunler=["Pzt","salı",
       "çar","perş"]
print(gunler)

['Pzt', 'salı', 'çar', 'perş']

Headings¶

Headings are normally created with #, ##, ###, ..., corresponding to h1, h2, h3, ... in HTML. All the headings in this document were created this way. If you have opened the document in your own Jupyter, select any heading cell and press Enter. Enter puts the cell into edit mode, so you can see how the heading was written. For example, the heading of this paragraph looks like the following

If you have installed nbextensions, headings of this kind appear numbered, like 1.1.2. If you are viewing this document through a notebook viewer (such as GitHub or nbviewer), you will not see the 1.1.2-style numbering. For details, see the Guide section at the very beginning

This, on the other hand, is a bold heading created with HTML (the strong tag). Use this style for headings that you do not want to appear in the "Contents" panel on the left.

Other emphasis styles

three asterisks for bold and italic
two asterisks for bold
one asterisk for italic
a backtick for highlighted (code-style) text. You can type this character with Alt+96.

Paragraphs, line breaks and HTML usage¶

Upper paragraph

You can start a new paragraph by pressing Enter twice (as in this line; double-click the cell to see)
or move to the next line by adding a "br" tag at the end of a line. (as in this line; double-click the cell to see)

NOTE: Until I learned/discovered these conveniences, I handled paragraphs and line breaks with the Raw NBConvert cell type instead of Markdown. Since learning this, I have rarely needed Raw NBConvert cells.

Naming convention¶

Search for "PEP 8 python" to see all the details. Below I only cover the uses of "_"

In [20]:

_privatevariable=3
#_privatemethod()
list_=[1,2,3] # names that clash with built-ins or keywords get a trailing _. Naming a variable list would shadow the built-in list, so use list_
dict_={"ad":"ali","soyad":"yılmaz"}
for x,_ in dict_.items(): # "_" for values we are not interested in
    print(x)

ad
soyad

Apart from this, in sklearn (the machine learning library) you will see that some attributes end with _. This means the model must be trained (fitted) before the attribute can be accessed; on an untrained model the attribute does not exist yet.
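The trailing-underscore convention can be imitated in a few lines. The sketch below is not scikit-learn code: `MeanModel` and `mean_` are hypothetical names, used only to illustrate why such an attribute is unavailable before `fit` has been called.

```python
class MeanModel:
    """Toy estimator (hypothetical, not part of scikit-learn)."""

    def fit(self, values):
        # Attributes learned from the data get a trailing underscore
        self.mean_ = sum(values) / len(values)
        return self


model = MeanModel()
print(hasattr(model, "mean_"))  # False: the model has not been fitted yet
model.fit([1, 2, 3, 4])
print(model.mean_)              # 2.5
```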

Module, Package, Class¶

These three concepts are ordered hierarchically as follows: package > module > class.

That is, every class lives inside a module. Modules are files with the .py extension. Several modules together form a package. E.g. the pandas package is used in data science work, and it ships with the Anaconda distribution.

There is also the concept of a library. In Python, a library means something different from DLL files in languages such as C/C#. Here it refers more to a conceptual collection formed by certain modules or packages, e.g. a machine learning library. If you insist on a physical counterpart, think of libraries as packages.
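The package > module > class hierarchy can be inspected directly. As a small illustration, the sketch below uses the standard-library `json` package (chosen here only as a convenient example) and checks each level with `inspect`.

```python
import inspect
import json            # a package: a folder containing several modules
import json.decoder    # one module inside that package

print(hasattr(json, "__path__"))                  # True: only packages have __path__
print(inspect.ismodule(json.decoder))             # True: decoder.py is a module
print(inspect.isclass(json.decoder.JSONDecoder))  # True: a class defined in that module
print(json.decoder.JSONDecoder.__module__)        # json.decoder
```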

WARNING: The next part is confusing. Skip it for now if you like, but be sure to come back later and, in particular, read the link below

When you want to install a new package:

conda install package_name is enough. If that does not work,
you can use pip install package_name. Sometimes this has to be written with a leading !. With !, the line runs as a shell command; without it, it runs as if it had a leading %. Working without the % is what automagic means; look up automagic for the details.

But now forget both of the methods above and look at the link below. It shows how installations should be done.

You can find more detailed information at the following link:
https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/

In [21]:

# -q keeps pip from printing lengthy installation details
!pip -q install forbiddenfruit

In [22]:

# as an example, here I would install keras, a deep learning package
#conda install keras

In [23]:

# list of installed packages
!pip list | head -n 10 # limited to the first 10 lines (| and what follows is a Linux command; on Windows you would need something like more)

Package                          Version
-------------------------------- ---------------------
absl-py                          1.4.0
accelerate                       0.34.2
aiohappyeyeballs                 2.4.0
aiohttp                          3.10.5
aiosignal                        1.3.1
alabaster                        0.7.16
albucore                         0.0.16
albumentations                   1.4.15
ERROR: Pipe to stdout was broken
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
BrokenPipeError: [Errno 32] Broken pipe

In [24]:

# information about a package
!pip show keras

Name: keras
Version: 3.4.1
Summary: Multi-backend Keras.
Home-page: https://github.com/keras-team/keras
Author: Keras team
Author-email: keras-users@googlegroups.com
License: Apache License 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: absl-py, h5py, ml-dtypes, namex, numpy, optree, packaging, rich
Required-by: tensorflow

In [25]:

# to upgrade to a newer version, if one exists
!pip install numpy --upgrade
# or conda update numpy

Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (2.1.1)

In [26]:

# if the new version causes problems, you can go back to an older one
#!pip install --upgrade package_name==version_number # (e.g. pip install --upgrade werkzeug==0.12.2)

In [27]:

# to remove a package completely
#!pip uninstall package_name

This command removes only the package itself. However, a package is installed together with many dependencies, that is, other packages it needs in order to work. Some of them may also be used by other packages, while others are used only by the package you want to remove. That last group would have to be uninstalled one by one, but fortunately there is a package for this as well: if you install it and run it as shown below, the package you want to remove is uninstalled along with the unneeded packages that came with it. Detailed information is available here.

In [28]:

!pip install pip-autoremove

Requirement already satisfied: pip-autoremove in /usr/local/lib/python3.10/dist-packages (0.10.0)
Requirement already satisfied: pip in /usr/local/lib/python3.10/dist-packages (from pip-autoremove) (24.1.2)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from pip-autoremove) (71.0.4)

In [29]:

!pip-autoremove somepackage -y  #remove "somepackage" plus its dependencies:

somepackage is not an installed pip module, skipping
numpy 2.1.1 is installed but numpy<2.0a0,>=1.23 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<1.27,>=1.20 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2.0,>=1.18.5 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2,>=1 is required
Redoing requirement with just package name...
The 'jedi>=0.16' distribution was not found and is required by the application
Skipping jedi
numpy 2.1.1 is installed but numpy<2.1,>=1.22 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2,>=1.22.4; python_version < "3.11" is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2,>=1.17.0 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2.0a0,>=1.23 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2.0,>=1.17.3 is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2.0.0,>=1.23.5; python_version <= "3.11" is required
Redoing requirement with just package name...
numpy 2.1.1 is installed but numpy<2.0.0,>=1.19.0; python_version >= "3.9" is required
Redoing requirement with just package name...
The 'pycairo>=1.16.0' distribution was not found and is required by the application
Skipping pycairo

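By the way, to see which installed packages (and versions) match a name, run the command below. The dependencies a package itself declares are listed in the Requires field of pip show.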

In [30]:

!pip freeze | grep -i pandas

geopandas==1.0.1
pandas==2.1.4
pandas-datareader==0.10.0
pandas-gbq==0.23.1
pandas-stubs==2.1.4.231227
sklearn-pandas==2.2.0
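The declared dependencies of an installed package can also be read from Python itself. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `declared_requirements` is made up for this example, and the output depends on what is installed in your environment.

```python
from importlib import metadata


def declared_requirements(package_name):
    """Return the requirement strings a package declares, or [] if none are found."""
    try:
        # requires() returns None when the package declares no requirements
        return metadata.requires(package_name) or []
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment
        return []


for requirement in declared_requirements("pandas"):
    print(requirement)
```

Unlike `pip freeze | grep`, which matches names textually, this reads the metadata of one specific package.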

In [31]:

# to see all out-of-date packages; this can take a very long time
# !pip list --outdated

ERROR: Operation cancelled by user
^C

In [32]:

# find out the Python version
!python --version

Python 3.10.12

In [33]:

# upgrade the Python version
#!conda update python

In [34]:

# which folders Python searches for packages
import sys
sys.path

Out[34]:

['/content',
 '/env/python',
 '/usr/lib/python310.zip',
 '/usr/lib/python3.10',
 '/usr/lib/python3.10/lib-dynload',
 '',
 '/usr/local/lib/python3.10/dist-packages',
 '/usr/lib/python3/dist-packages',
 '/usr/local/lib/python3.10/dist-packages/IPython/extensions',
 '/usr/local/lib/python3.10/dist-packages/setuptools/_vendor',
 '/root/.ipython']

In [35]:

import site
print(site.getsitepackages())

['/usr/local/lib/python3.10/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.10/dist-packages']

Special installation methods

In IPython (jupyter) 7.3 and later, there is a magic %pip and %conda command that will install into the current kernel (rather than into the instance of Python that launched the notebook).

%pip install geocoder

! pip install --user (The ! tells the notebook to execute the cell as a shell command.)

Including modules and classes (and even functions) in our code¶

Module reference: import x. Usage: x followed by the member, e.g. x.some_method, x.some_property, x.something
Including everything in a module: from x import *. Usage: something(...)
Including a single thing: from x import something. something can then be used directly, with no need for x.something (unlike the line above, far fewer names are imported)

NOTE: Try to import as few names as possible. The whole module is loaded either way, so the gain is less about speed and more about a clean namespace: from x import * can silently overwrite names you already use.
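The import styles listed above can be compared side by side. The sketch below uses `math` and `os` from the standard library; instead of actually running a star import, it only checks which names one would bring in.

```python
import math              # module reference: every use is prefixed with math.
from math import sqrt    # a single name: usable without a prefix
import os

print(math.sqrt(16))     # 4.0
print(sqrt(16))          # 4.0

# `from os import *` imports every name listed in os.__all__.
# That list contains "open", so a star import would replace the built-in open().
print("open" in os.__all__)
```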

In [36]:

from math import sqrt # the sqrt function from the math module
kök=sqrt(16)
kök

Out[36]:

4.0

In [37]:

from os import * # everything in the os module (this also replaces the built-in open with os.open)
mkdir("test") # no need to write os. This created a folder named test
removedirs("test") # and right afterwards we deleted it
!dir

drive  sample_data

Importing your own modules¶

Over time you will notice that you do certain tasks often, and you will collect them (classes, functions) in a module of your own. After that, you import it just like any other module.

In [38]:

# assuming you have a file named mypythonutility.py
# import mypythonutility

Sometimes, however, we need to update our code frequently. While working in another notebook that imports the module, we want it to pick up the latest version. There is a way to do this without restarting the notebook:

In [39]:

%load_ext autoreload
%autoreload 2
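Outside IPython, or when only one module needs refreshing, the standard library's `importlib.reload` gives a similar effect by hand. The sketch below is a self-contained demo under assumed names: the module `demo_reload_utils` and its `greet` function are hypothetical and are created in a temporary folder.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical module, written to a temporary folder just for this demo
folder = tempfile.mkdtemp()
module_file = pathlib.Path(folder) / "demo_reload_utils.py"
module_file.write_text("def greet():\n    return 'v1'\n")

sys.path.insert(0, folder)
demo_reload_utils = importlib.import_module("demo_reload_utils")
print(demo_reload_utils.greet())   # v1

# Edit the file on disk, then reload it without restarting the interpreter
module_file.write_text("def greet():\n    return 'version 2'\n")
importlib.invalidate_caches()
importlib.reload(demo_reload_utils)
print(demo_reload_utils.greet())   # version 2
```

`%autoreload 2` does this automatically before each cell runs, which is usually more convenient in a notebook.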

Using your own module like a package¶

Read this part later on; I placed it here so that you are aware of it and for the sake of keeping the topic together. For now, knowing the previous item is enough.

To import a module you have written, you must either be in the same folder as the module or switch to that folder with os.chdir(). To avoid doing this every time, it is useful to turn your module into a package.

To do so, create a folder and put your .py file in it. Also put an empty '__init__.py' file in this folder. Then move the folder to where all Python packages live (site-packages).

If you work in an environment with limited permissions, such as JupyterHub, and are not allowed to create folders under site-packages, create the folder in your own area and add its location to the system path. You can do that with the command below.

In [40]:

# import sys
# sys.path.append("location of the folder") # append, not +=: += with a string would add each character as a separate entry
# print(sys.path) # print it to check that the path was added
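The package layout and the search-path step described above can be tried safely in a temporary folder. In this sketch, the package name `demo_pkg`, its module `tools` and the function `double` are all hypothetical.

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical package "demo_pkg" containing a single module "tools.py"
parent = pathlib.Path(tempfile.mkdtemp())
package_dir = parent / "demo_pkg"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")   # empty marker file
(package_dir / "tools.py").write_text("def double(x):\n    return 2 * x\n")

# The folder that CONTAINS the package goes on the search path
sys.path.append(str(parent))
importlib.invalidate_caches()

tools = importlib.import_module("demo_pkg.tools")
print(tools.double(21))   # 42
```

Note that it is the parent folder, not the package folder itself, that is added to `sys.path`.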

Things are a little different on Colab, of course. First you must allow Colab to connect to Drive. To do this, run the cell below and grant the permissions

In [41]:

# first this
from google.colab import drive
drive.mount("/content/drive/")

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).

In [42]:

# then this
import sys
sys.path.insert(0,'/content/drive/MyDrive/Programming/PythonRocks/') # the package named mypyext is in this folder

In [43]:

# now I can use my own package
import mypyext

Virtual Environment¶

This is explained well at https://realpython.com/python-virtual-environments-a-primer/.
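A virtual environment can also be created from Python with the standard-library `venv` module, which works the same way on Windows and Linux. This is a minimal sketch: the location is a temporary folder picked for the demo, and `with_pip=False` is an assumption made to keep it fast and offline.

```python
import pathlib
import tempfile
import venv

# Hypothetical location for the environment
env_dir = pathlib.Path(tempfile.mkdtemp()) / "env"
venv.create(env_dir, with_pip=False)

# Every virtual environment has a pyvenv.cfg file at its root
print((env_dir / "pyvenv.cfg").exists())
print(sorted(p.name for p in env_dir.iterdir()))
```

Activation is still done from the shell: `env\Scripts\activate` on Windows, `source env/bin/activate` on Linux and macOS.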

In [44]:

# upgrade pip to the latest version (py is the Windows Python launcher, so these cells fail when run on Colab/Linux, as the outputs show)
!py -m pip install --upgrade pip

/bin/bash: line 1: py: command not found

In [45]:

# installing
!py -m pip install --user virtualenv

/bin/bash: line 1: py: command not found

In [46]:

# creating
!py -m venv env # creates it inside the current working folder

/bin/bash: line 1: py: command not found

In [47]:

# activation
!.\env\Scripts\activate

/bin/bash: line 1: .envScriptsactivate: command not found

In [48]:

# let's see whether it was added to PATH
p=!PATH
str(p).split(";")

Out[48]:

["['/bin/bash: line 1: PATH: command not found']"]

In [49]:

!where python

/bin/bash: line 1: where: command not found

In [50]:

!deactivate

/bin/bash: line 1: deactivate: command not found

Data types¶

In [51]:

i=1
f=1.0
s="merhaba"
# this is a comment
"""
hello
Triple quotes like these are meant for function docstrings,
but they can also be used for comments that span several lines
"""
print(type(i))
print(type(f))
print(type(s))

Out[51]:

'\nhello\nTriple quotes like these are meant for function docstrings,\nbut they can also be used for comments that span several lines\n'

<class 'int'>
<class 'float'>
<class 'str'>

In [52]:

#Legal variable names:
myvar = "John"
my_var = "John"
_my_var = "John"
myVar = "John"
MYVAR = "John"
myvar2 = "John"

In [53]:

#Illegal variable names:
# 2myvar = "John"
# my-var = "John"
# my var = "John"

In [54]:

# multiple assignment
x, y, z = "Orange", "Banana", "Cherry"
print(x)
print(y)
print(z)

Orange
Banana
Cherry

In [55]:

#exponential
x = 1e9
y = 123E9
x
y

Out[55]:

1000000000.0

Out[55]:

123000000000.0

Data type conversion¶

In [56]:

y=float(1)
z=int(2.8)
s=str(y)

y,type(y)
z,type(z)
s,type(s)
print(s,type(s))

Out[56]:

(1.0, float)

Out[56]:

(2, int)

Out[56]:

('1.0', str)

1.0 <class 'str'>

In [57]:

z="1"
y=int(z)
type(y)

Out[57]:

int

In [58]:

x=1
print(isinstance(x,int))

True
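Conversions such as `int("1")` fail with a `ValueError` when the text is not a valid number, so it is common to wrap them. The helper `to_int` below is a hypothetical name used only for illustration.

```python
def to_int(text, default=None):
    """Convert text to int, returning default when it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default


print(to_int("42"))        # 42
print(to_int("4.2"))       # None: int() does not parse decimal strings
print(to_int("abc", 0))    # 0
print(int(float("4.2")))   # 4: convert to float first, then truncate
```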

Functions¶

Classic function¶

You are expected to have learned what functions are in general from another source. As I said at the start, the aim of this document is to provide a large cheatsheet.

In [59]:

def islem_yapan_parametresiz_fonksiyon(): # like the void return type in C-based languages
    print("selam")

islem_yapan_parametresiz_fonksiyon()

selam

In [60]:

def sonuc_donduren_ve_parametre_almis_fonksiyon(karesi_alinacak_sayi):
    """
    this function returns the square of the number it receives
    Args:
        karesi_alinacak_sayi: number
    """
    return karesi_alinacak_sayi**2

sonuc=sonuc_donduren_ve_parametre_almis_fonksiyon(4) # assign the returned result to a variable
print(sonuc)

In [61]:

# unlike in some languages, Python functions can return several values (they are packed into a tuple)
def cokdegerdondur(sayı):
    return sayı,sayı*10,sayı*100

kendi,onkat,yuzkat=cokdegerdondur(5)
print(onkat)
print(kendi)

50
5

Variable numbers of parameters with *args (param array) and the concept of default values¶

The number of parameters does not have to be fixed. A function can accept a variable number of parameters.

In [62]:

def SayılarıToplaXeBöl(arg1,*args): # arg1 is optional, but if present it must come before *args
    """
    This function sums the parameters after the first one and divides the total by the first parameter
    """
    Toplam=0
    for a in args:
        Toplam+=a
    return Toplam/arg1

x=SayılarıToplaXeBöl(10,1,2,3,4,5,6,7,8,9,10) # no "*" when the arguments are written out literally
print(x)
y=SayılarıToplaXeBöl(10,*range(1,11)) # "*" unpacks the arguments when they come from a function or a variable
print(y)

5.5
5.5

In [63]:

def dictparametreli(**kwargs): # with **, the function accepts a dictionary or, more generally, keyword arguments
    for k,v in kwargs.items():
        print(k,v)

dict_={}
dict_["ad"]="Volkan"
dict_["soyad"]="Yurtseven"
dictparametreli(**dict_) # ** because it is passed as a variable
# or
dictparametreli(ad="Volkan",soyad="Yurtseven")

ad Volkan
soyad Yurtseven
ad Volkan
soyad Yurtseven

Some parameters are written with default values. If no value is passed for them, the default is used

In [64]:

def opsiyonelli(adet,min_=1, max_=10):
    print(adet,min_,max_)

opsiyonelli(100,5) # the last parameter takes its default, 10

100 5 10

Lambda and anonymous functions¶

Lambda expressions let you perform an operation inline instead of defining a function

In [65]:

def kareal(sayı):
    return sayı**2

# instead of defining the function above we can write a lambda
kareal2=lambda x:x**2

print(kareal(10))
print(kareal2(10))

100
100
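In practice lambdas are rarely assigned to a name; they are mostly passed straight to another function. As a small illustration (the sample data is made up), here is a lambda used as a sort key and with `map`.

```python
people = [("ali", 34), ("emel", 27), ("cemil", 41)]

# Sort by the second element of each tuple (the age)
by_age = sorted(people, key=lambda person: person[1])
print(by_age)    # [('emel', 27), ('ali', 34), ('cemil', 41)]

# Apply a lambda to every element of a list
squares = list(map(lambda x: x**2, [1, 2, 3]))
print(squares)   # [1, 4, 9]
```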

Strings¶

Slicing¶

In [66]:

from mypyext.pythonutility import * # importing the utility module I wrote myself.
# it contains different ways of printing, such as printy, which also prints the line number.
# You need to have enabled Toggle Line Numbers in the View menu.

In [67]:

metin="volkan yurtseven"
printy(metin[0])
printy(metin[:3]) #left 3
printy(metin[4:]) #substr
printy(metin[2:5]) #substr
printy(metin[-1]) #son
printy(metin[-3:]) #right 3
printy(metin[::-1]) #ters

(2, ['printy(metin[0])\n']) 
----------
v
 
(3, ['printy(metin[:3]) #left 3\n']) 
----------
vol
 
(4, ['printy(metin[4:]) #substr\n']) 
----------
an yurtseven
 
(5, ['printy(metin[2:5]) #substr\n']) 
----------
lka
 
(6, ['printy(metin[-1]) #son\n']) 
----------
n
 
(7, ['printy(metin[-3:]) #right 3\n']) 
----------
ven
 
(8, ['printy(metin[::-1]) #ters\n']) 
----------
nevestruy naklov

String formatting¶

In [68]:

mesaj="İnsanların yaklaşık %s kadın olup %s erkektir" % ("yarısı","kalanı")
mesaj
# s: string, d: integer

Out[68]:

'İnsanların yaklaşık yarısı kadın olup kalanı erkektir'

In [69]:

# this method, {}, is used more often
mesaj="Python {} bir dildir".format("güzel")
mesaj

Out[69]:

'Python güzel bir dildir'

In [70]:

# or simple concatenation with +
mesaj="python"
mesaj=mesaj + " güzel bir dildir"
mesaj

Out[70]:

'python güzel bir dildir'

In [71]:

# the newest method: f-strings, also known as f-literals
ad="volkan"
yas=41
print(f"Benim adım {ad} olup yaşım {yas}")

Benim adım volkan olup yaşım 41

In [72]:

# thousands separators and decimal places with f-strings
dunyanufusu=8000000000
pi=3.14159
print(f"pi sayısı yaklaşık olarak {pi:.2f} olup dünyada yaklaşık {dunyanufusu:,} kişi yaşamaktadır")

pi sayısı yaklaşık olarak 3.14 olup dünyada yaklaşık 8,000,000,000 kişi yaşamaktadır
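The part after the colon in an f-string is a format specification, and a few more variants are worth keeping at hand. The values below are arbitrary examples.

```python
ratio = 0.256
count = 42
name = "volkan"

print(f"{ratio:.1%}")    # 25.6%        percentage with one decimal place
print(f"{count:06d}")    # 000042       zero-padded to a width of 6
print(f"[{name:>10}]")   # [    volkan] right-aligned in 10 characters
print(f"[{name:<10}]")   # [volkan    ] left-aligned in 10 characters
```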

String functions¶

In [73]:

parçalı=metin.split()
parçalı

Out[73]:

['volkan', 'yurtseven']

In [74]:

print(metin*3)

volkan yurtsevenvolkan yurtsevenvolkan yurtseven

In [75]:

metin.replace("e","i")

Out[75]:

'volkan yurtsivin'

In [76]:

print(metin.upper(), metin.lower(), metin.capitalize(), metin.title())

VOLKAN YURTSEVEN volkan yurtseven Volkan yurtseven Volkan Yurtseven

In [77]:

print(metin.startswith("v"),metin.endswith("d"))

True False

In [78]:

kelime="     naber   dostum     "
print("yeni:"+kelime.strip()+".") # does not remove the spaces in the middle

yeni:naber   dostum.

In [79]:

isim="Volkan"
user="ABC123"
yas="42" # must be in quotes, since these are string methods
mail="volkan.yurtseven@hotmail.com"

print(isim.isalnum(),user.isalnum(),yas.isalnum(),mail.isalnum())
print(isim.isalpha(),user.isalpha(),yas.isalpha(),mail.isalpha())
print(isim.isdigit(),user.isdigit(),yas.isdigit(),mail.isdigit()) # isnumeric also works. For the difference see https://stackoverflow.com/questions/44891070/whats-the-difference-between-str-isdigit-isnumeric-and-isdecimal-in-python
print(isim.isprintable(),user.isprintable(),yas.isprintable(),mail.isprintable())

True True True False
True False False False
False False True False
True True True True
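The difference between `isdecimal`, `isdigit` and `isnumeric` only shows up with characters beyond plain 0-9. A short sketch with a superscript two and a fraction character illustrates it:

```python
samples = ["42", "²", "½"]
for text in samples:
    # Columns: the text, isdecimal, isdigit, isnumeric
    print(repr(text), text.isdecimal(), text.isdigit(), text.isnumeric())

# '42' True True True
# '²' False True True
# '½' False False True
```

For validating input that will be passed to `int()`, `isdecimal` is therefore the strictest of the three.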

For all the other string methods, see: https://www.w3schools.com/python/python_ref_string.asp

Special characters and literals¶

In [80]:

#escape
print("Şifre:\"abc123\"")
print("c:\\python\\abc\\xyz\\sdf")
print(r"c:\python\abc\xyz\sdf") # r: raw, meaning backslashes are not treated as escape characters

Şifre:"abc123"
c:\python\abc\xyz\sdf
c:\python\abc\xyz\sdf

In [81]:

# literals: b, r, f
a=b"volkan"
b="volkan"
print(type(a))
print(type(b))

<class 'bytes'>
<class 'str'>

In [82]:

import sys
sys.getsizeof(a)
sys.getsizeof(b)

Out[82]:

In [83]:

adres1="c:\\falanca\\filanca"
adres2=r"c:\falanca\filanca"
adres1==adres2

Out[83]:

True

Other operations¶

In [84]:

# converting to a list
liste=list(metin)
print(liste)

['v', 'o', 'l', 'k', 'a', 'n', ' ', 'y', 'u', 'r', 't', 's', 'e', 'v', 'e', 'n']

In [85]:

# membership check
print("l" in metin)
print(metin.find("z")) # returns -1 if not found
try:
  print(metin.index("z")) # raises ValueError if not found
except ValueError:
  print("bulamadı")

True
-1
bulamadı

In [86]:

for m in metin:
    print(m,end="\n")

v
o
l
k
a
n
 
y
u
r
t
s
e
v
e
n

In [87]:

# for paths containing backslashes (spaces are not a problem) we put "r" in front so that \ is not treated as an escape character, like @ in C#
path=r"E:\falan filan klasörü\sub klasör"

In [88]:

metin.count("e") # how many times the letter e occurs in metin

Out[88]:

string module¶

In [89]:

import string

In [90]:

string.punctuation
string.printable
string.whitespace
string.digits
string.ascii_letters

Out[90]:

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

Out[90]:

'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

Out[90]:

' \t\n\r\x0b\x0c'

Out[90]:

'0123456789'

Out[90]:

'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

Conditional statements¶

Conditional statements, loops and data structures are common to all programming languages and need to be understood well. There is a good site for testing yourself on these topics. It has questions on various subjects at levels from easy to hard; you solve them, submit your answers and earn points. I recommend using it.

https://www.hackerrank.com

In [91]:

i=10 # run this with 10, 20 and 30 in turn
if i<20:
    print("20den küçük")
elif i==20: # double =
    print("tam 20")
else:
    print("20den büyük")

20den küçük

In [92]:

# one-liner (ternary) if-else
x=3
sonuc="high" if x>10 else "low"
print(sonuc)

low

Loops¶

They are mostly used to iterate over data structures such as list and dict. We will look at these data structures in detail a little further below

There are two kinds of loop: while and for.
for works like foreach; there is no classic counter-based for. The range function can be used instead.

In [93]:

fruits = ["apple", "banana", "cherry"]
for x in fruits:
    print(x)

apple
banana
cherry

In [94]:

# using range for a classic for loop.
for i in range(len(fruits)):
    print(fruits[i])

apple
banana
cherry
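When both the position and the item are needed, the built-in `enumerate` is usually preferred over `range(len(...))`. A short sketch with the same list:

```python
fruits = ["apple", "banana", "cherry"]

# enumerate yields (index, item) pairs, starting at 0 by default
for index, fruit in enumerate(fruits):
    print(index, fruit)

# start=1 gives human-friendly numbering
for position, fruit in enumerate(fruits, start=1):
    print(position, fruit)
```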

In [95]:

# strings can also be iterated with a loop
isim="volkan"
for i in isim:
    print(i,end="-")

v-o-l-k-a-n-

In [96]:

# exiting a loop
fruits = ["apple", "banana", "cherry"]
for x in fruits:
    print(x)
    if x == "banana":
        break

apple
banana

In [97]:

#with while we stay in the loop as long as the condition holds
i = 1
while i < 6:
    print(i)
    i += 1

In [98]:

#we can leave the loop when some intermediate condition occurs
i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1

1
2
3

In [99]:

#one of the most common uses of while: keep taking input until the user types exit/quit
while True:
    rakam=input("Karesi alınacak bir sayı giriniz, çıkış için q tuşuna basın:")
    if rakam=="q":
        print("Hoşçakalın")
        break
    else:
        print(str(int(rakam)**2))

Karesi alınacak bir sayı giriniz, çıkış için q tuşuna basın:5
25
Karesi alınacak bir sayı giriniz, çıkış için q tuşuna basın:q
Hoşçakalın
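
The loop above stops with a ValueError if the user types something that is neither a number nor q. A minimal sketch of a more forgiving version; square_or_error is a hypothetical helper name, and the interactive part is commented out so the cell also runs unattended.

```python
def square_or_error(text):
    """Return the square of an integer typed as text, or an error message."""
    try:
        return str(int(text) ** 2)
    except ValueError:
        return f"'{text}' is not a whole number"

# Interactive loop, commented out so the cell also runs without a keyboard:
# while True:
#     entry = input("Number to square (q to quit): ")
#     if entry == "q":
#         break
#     print(square_or_error(entry))

print(square_or_error("5"))
print(square_or_error("abc"))
```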

In [100]:

#an exercise from the hackerrank site
def staircase(n):
    for i in range(n):
        print((n-i-1)*" "+"#"*(i+1))
staircase(6)

     #
    ##
   ###
  ####
 #####
######

Using "else" in loops¶

In for loops¶

The else block runs last, once the whole sequence has been used up; it is skipped if the loop ends with break.

In [101]:

for i in range(4):
    print(i)
else:
    print("bitti")

0
1
2
3
bitti

In [102]:

for num in range(10,20):
    for i in range(2,num):
        if num%i==0:
            j=num/i
            print("{} equals {}*{}".format(num,i,j))
            break
    else:
        print(num," bir asal sayıdır")

10 equals 2*5.0
11  bir asal sayıdır
12 equals 2*6.0
13  bir asal sayıdır
14 equals 2*7.0
15 equals 3*5.0
16 equals 2*8.0
17  bir asal sayıdır
18 equals 2*9.0
19  bir asal sayıdır
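
The same for/else pattern works for any "search, and fall back if nothing was found" task. A small sketch; first_multiple is a name I made up for the example.

```python
def first_multiple(numbers, divisor):
    """Return the first element divisible by divisor, or None."""
    for n in numbers:
        if n % divisor == 0:
            found = n
            break          # leaving via break skips the else block
    else:
        found = None       # runs only when the loop was never broken
    return found

print(first_multiple([3, 8, 10, 12], 5))
print(first_multiple([3, 8, 11], 5))
```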

In while loops¶

The else block runs when the condition is no longer true; it is skipped if the loop ends with break.

In [103]:

n=5
while n!=0:
    print(n)
    n-=1
else:
    print("artık sağlanmıyor")

5
4
3
2
1
artık sağlanmıyor

Breaking out of nested loops¶

With nested loops, break leaves only the innermost loop, and execution continues at the first line after that block.

Leaving the inner loop, continuing the outer loop¶

In [104]:

liste=[]
for x in list("bacde"):
    for z in ["ali","dursun","ıtır","emel","cemil"]:
        print(x,z) #for checking
        if x in z:
            print(f"{z} ekleniyor ve iç döngüden çıkış yapılacak\n")
            liste.append(z)
            break #I leave after the first match, so nothing is added twice; comment/uncomment to compare
    else:
        print(f"{x} için iç döngüde eşleşme bulunamadı\n")
print(liste)

b ali
b dursun
b ıtır
b emel
b cemil
b için iç döngüde eşleşme bulunamadı

a ali
ali ekleniyor ve iç döngüden çıkış yapılacak

c ali
c dursun
c ıtır
c emel
c cemil
cemil ekleniyor ve iç döngüden çıkış yapılacak

d ali
d dursun
dursun ekleniyor ve iç döngüden çıkış yapılacak

e ali
e dursun
e ıtır
e emel
emel ekleniyor ve iç döngüden çıkış yapılacak

['ali', 'cemil', 'dursun', 'emel']

Breaking out of all loops¶

In [105]:

#if any single match is enough, let me add the first one I see and get out
liste=[]
for x in list("bkcde"): #a replaced with k
    for z in ["ali","dursun","ıtır","emel","cemil"]:
        print(x,z) #for checking
        if x in z:
            print(f"{z} ekleniyor ve tüm döngüden çıkış yapılacak\n")
            liste.append(z)
            break
    else:
        print(f"dış döngüdeki {x} için tur tamamlandı,sonraki için devam\n")
        continue
    break #reached only when the inner loop was left with break

print(liste)

b ali
b dursun
b ıtır
b emel
b cemil
dış döngüdeki b için tur tamamlandı,sonraki için devam

k ali
k dursun
k ıtır
k emel
k cemil
dış döngüdeki k için tur tamamlandı,sonraki için devam

c ali
c dursun
c ıtır
c emel
c cemil
cemil ekleniyor ve tüm döngüden çıkış yapılacak

['cemil']

In [106]:

#exit as soon as the sum of an element from each list exceeds 20
dizi=[[11,21,3],[5,15,6]]
records=[]

for j in dizi[0]:
    for i in dizi[1]:
        if j+i>20:
            records.append((j,i,j+i))
            break
    else:
        continue
    break

records

Out[106]:

[(11, 15, 26)]

In [107]:

#method 2: use return inside a function
records=[]
def myfonk():
    dizi=[[11,21,3],[5,15,6]]
    for j in dizi[0]:
        for i in dizi[1]:
            if j+i>20:
                records.append((j,i,j+i))
                return

myfonk()
print(records)

[(11, 15, 26)]

In [108]:

#method 3: exception
records=[]
try:
    dizi=[[11,21,3],[5,15,6]]
    for j in dizi[0]:
        for i in dizi[1]:
            if j+i>20:
                records.append((j,i,j+i))
                raise StopIteration
except StopIteration: pass
records

Out[108]:

[(11, 15, 26)]
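
A fourth option, not in my original notes, is to flatten the two loops into one with itertools.product, so that a single break is enough. A sketch on the same data:

```python
from itertools import product

dizi = [[11, 21, 3], [5, 15, 6]]
records = []

# product yields (j, i) pairs in the same order as the nested loops,
# so a single break ends the whole search.
for j, i in product(dizi[0], dizi[1]):
    if j + i > 20:
        records.append((j, i, j + i))
        break

print(records)
```

This only fits when the inner sequence does not depend on the outer loop variable.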

Data Structures¶

List¶

In [109]:

liste=[0,1,2,3,4,5]
liste.append(6)
print(liste[:2]) #slicing works as with strings
print(3 in liste) #membership test

[0, 1]
True

In [110]:

son=liste.pop() #removes the last element and assigns it to son
print(son)
print(liste)

6
[0, 1, 2, 3, 4, 5]

In [111]:

rangelist=list(range(0,100,3))
print(rangelist)

[0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93, 96, 99]

Sorting¶

The sort method sorts a list in place and returns nothing (None), so its result cannot be assigned to a variable. To get the sorted version into another variable, use the sorted function.

In [112]:

meyveler=["elma","muz","portakal","çilek","üzüm","armut","muz"]
printy(meyveler.index("muz")) #index of the first occurrence
printy(meyveler.count("muz"))
meyveler.sort()
printy(meyveler)
meyveler.reverse()
printy(meyveler)

(2, ['printy(meyveler.index("muz")) #index of the first occurrence\n']) 
----------
1
 
(3, ['printy(meyveler.count("muz"))\n']) 
----------
2
 
(5, ['printy(meyveler)\n']) 
----------
['armut', 'elma', 'muz', 'muz', 'portakal', 'çilek', 'üzüm']
 
(7, ['printy(meyveler)\n']) 
----------
['üzüm', 'çilek', 'portakal', 'muz', 'muz', 'elma', 'armut']

In [113]:

siralimeyveler=sorted(meyveler,reverse=True) #this also asks for descending order. the sort method takes the same parameter
print(siralimeyveler)

['üzüm', 'çilek', 'portakal', 'muz', 'muz', 'elma', 'armut']
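
Note that in the output above 'çilek' and 'üzüm' land after 'portakal': by default strings are compared by character code, not by the Turkish alphabet. Both sort and sorted also accept a key function that decides what is compared. A small sketch with sample data of my own:

```python
words = ["portakal", "Elma", "muz", "armut"]  # sample data for illustration

by_length = sorted(words, key=len)                 # shortest first
case_insensitive = sorted(words, key=str.lower)    # ignore upper/lower case

print(by_length)
print(case_insensitive)
print(words)   # sorted() leaves the original list untouched
```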

Tuple¶

Like a list but immutable: elements cannot be added or removed. Written with () instead of [], or with no parentheses at all.

In [114]:

tpl=(1,2,3)
tpl2=1,2,3
print(type(tpl2))

<class 'tuple'>

Comprehension¶

It can be applied to all data structures and saves you from writing long loops. It is similar to the LINQ operations in C#, and arguably an even nicer alternative.

In [115]:

rangelistinikikatı=[x*2 for x in rangelist]
print(rangelistinikikatı)

[0, 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 90, 96, 102, 108, 114, 120, 126, 132, 138, 144, 150, 156, 162, 168, 174, 180, 186, 192, 198]

Conditional comprehension¶

[x for x in datastruct if x ...]
[x if ... else y for x in datastruct]

In [116]:

kısaisimlimeyveler=[x for x in meyveler if len(x)<5]
kısaisimlimeyveler

Out[116]:

['üzüm', 'muz', 'muz', 'elma']

In [117]:

liste=range(1,10)
sadecetekler=[sayı for sayı in liste if sayı % 2 !=0] #a single if: filters
tekler=[sayı if sayı%2!=0 else "" for sayı in liste] #if-else: picks a value for every element
print(sadecetekler)
print(tekler)

[1, 3, 5, 7, 9]
[1, '', 3, '', 5, '', 7, '', 9]

Nested list comprehension¶

syntax: [x for inner in outer for x in inner]

Suppose I want to flatten a 2-dimensional matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Expected output: flatten_matrix = [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [118]:

# 2-D List
matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

flatten_matrix = []

for sublist in matrix:
	for val in sublist:
		flatten_matrix.append(val)

print(flatten_matrix)

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [119]:

# 2-D List
matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# Nested List Comprehension to flatten a given 2-D matrix
flatten_matrix = [val for sublist in matrix for val in sublist]

print(flatten_matrix)

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [120]:

# 2-D List of planets
planets = [['Mercury', 'Venus', 'Earth'], ['Mars', 'Jupiter', 'Saturn'], ['Uranus', 'Neptune', 'Pluto']]

flatten_planets = []

for sublist in planets:
	for planet in sublist:

		if len(planet) < 6:
			flatten_planets.append(planet)

print(flatten_planets)

['Venus', 'Earth', 'Mars', 'Pluto']

In [121]:

flatten_planets = [planet for sublist in planets for planet in sublist if len(planet) < 6]
print(flatten_planets)

['Venus', 'Earth', 'Mars', 'Pluto']

In [122]:

kısalar=[p for iç in planets for p in iç if len(p)<6]
kısalar

Out[122]:

['Venus', 'Earth', 'Mars', 'Pluto']

For a more general illustration, see the explanation with an animated gif at https://stackoverflow.com/questions/18072759/list-comprehension-on-a-nested-list
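
For plain flattening by one level, the standard library offers itertools.chain.from_iterable as an alternative to the nested comprehension. A short sketch on the same matrix:

```python
from itertools import chain

matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# chain.from_iterable walks the sublists one after another.
flat = list(chain.from_iterable(matrix))
print(flat)
```

When you also need a condition, as in the planets example, the comprehension is still the more natural tool.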

Matrices and comprehensions on matrices¶

In [123]:

matris=[
    [1,2,3],
    [4,5,6],
    [7,8,9]
]
print(len(matris))
printy([satır for satır in matris]) #row by row
printy([satır[0] for satır in matris]) #first column
printy([[satır[i] for satır in matris] for i in range(3)]) #column by column, i.e. the transpose
printy([x for iç in matris for x in iç]) #nested

3
(7, ['printy([satır for satır in matris]) #row by row\n']) 
----------
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
 
(8, ['printy([satır[0] for satır in matris]) #first column\n']) 
----------
[1, 4, 7]
 
(9, ['printy([[satır[i] for satır in matris] for i in range(3)]) #column by column, i.e. the transpose\n']) 
----------
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
 
(10, ['printy([x for iç in matris for x in iç]) #nested\n']) 
----------
[1, 2, 3, 4, 5, 6, 7, 8, 9]
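
The transpose can also be written without indices, by unpacking the rows into zip. A minimal sketch on the same matris:

```python
matris = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# zip(*matris) pairs up the i-th elements of every row, i.e. the columns.
transpose = [list(column) for column in zip(*matris)]
print(transpose)
```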

Suppose the goal is to obtain the following:

matrix = [[0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4]]

In [124]:

matrix = []

for i in range(5):

	# Append an empty sublist inside the list
	matrix.append([])

	for j in range(5):
		matrix[i].append(j)

print(matrix)

[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

In [125]:

# Nested list comprehension
matrix = [[j for j in range(5)] for i in range(5)]

print(matrix)

[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
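
A related trap worth knowing: building a matrix by multiplying a list repeats the same row object, whereas a comprehension creates a fresh row each time. A small sketch with made-up sizes:

```python
# Repeating the outer list copies references, not rows.
aliased = [[0] * 3] * 2
aliased[0][0] = 99
print(aliased)        # both rows change, they are the same object

# A comprehension evaluates the inner expression once per row.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 99
print(independent)    # only the first row changes
```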

Generators¶

This is a somewhat more advanced topic, so I will make just one suggestion, for cases where a total is needed. If you are going to sum the result of a list comprehension (especially over a very large list), there is no need for the []: without them you get a generator expression, which saves memory. For details see: https://www.johndcook.com/blog/2020/01/15/generator-expression/

In [126]:

#list comprehension: all list elements are held in memory
sum([x**2 for x in range(10)])

Out[126]:

285

In [127]:

#generator expression: the elements are not held in memory
sum(x**2 for x in range(10))

Out[127]:

285
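
To see the memory difference rather than take my word for it, here is a rough sketch using sys.getsizeof. The size n is arbitrary, and the exact byte counts vary between Python versions, so only the comparison is printed.

```python
import sys

n = 100_000
as_list = [x**2 for x in range(n)]   # materialises every element
as_gen = (x**2 for x in range(n))    # produces elements on demand

# getsizeof reports the container only; exact numbers vary by Python version.
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))

print(sum(as_gen) == sum(as_list))   # same total either way
print(sum(as_gen))                   # a generator is spent after one pass
```

The last line prints 0: once a generator has been consumed it yields nothing more, so create a new one if you need a second pass.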

Stack¶

There is normally no such class. We use a list as a stack, thanks to append and pop. Last in, first out (LIFO).

In [128]:

stack=[1,2,3]
stack.append(4)
stack.pop()
stack

Out[128]:

[1, 2, 3]

Queue¶

We could build this from a list too if we wanted. First in, first out (FIFO). For this, though, there is a class in the collections module.

In [129]:

from collections import deque
kuyruk=deque([1,2,3])
kuyruk.append(4)
sıradaki=kuyruk.popleft()
print(sıradaki)
print(kuyruk)

1
deque([2, 3, 4])
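
Putting the two side by side makes the ordering difference easy to see. A small sketch with made-up items; a list could also serve as a queue via pop(0), but that shifts every remaining element, while deque.popleft does not.

```python
from collections import deque

stack = []                 # list used as a stack
queue = deque()            # deque used as a queue
for item in ["a", "b", "c"]:
    stack.append(item)
    queue.append(item)

# The stack hands items back newest first, the queue oldest first.
lifo = [stack.pop() for _ in range(3)]
fifo = [queue.popleft() for _ in range(3)]
print(lifo)
print(fifo)
```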

Dictionary¶

They hold key-value pairs. They used to be unordered (EDIT: from Python 3.7 on they preserve insertion order), and elements cannot be reached by position. We reach values through their keys, or we can get both at once by iterating in a loop.

Creation¶

Classic¶

In [130]:

dict_={}
dict_["one"]="bir" #there is no add, append or insert method; you assign directly
dict_["two"]="iki"
dict_["two"]="zwei"
printy(dict_.keys())
printy(dict_.values())
printy(dict_.items())
print(dict_["one"])
#print(dict_["three"]) # raises KeyError; use get to avoid it
print(dict_.get("three","N/A"))

(5, ['printy(dict_.keys())\n']) 
----------
dict_keys(['one', 'two'])
 
(6, ['printy(dict_.values())\n']) 
----------
dict_values(['bir', 'zwei'])
 
(7, ['printy(dict_.items())\n']) 
----------
dict_items([('one', 'bir'), ('two', 'zwei')])
 
bir
N/A
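
Alongside get, setdefault is useful when the value is itself a collection that has to be created on first use. A sketch that groups some sample words by their first letter:

```python
words = ["elma", "armut", "erik", "ayva", "muz"]   # sample data

# Group words by first letter; setdefault creates the list on first use.
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)
print(by_letter)

# get returns a fallback instead of raising KeyError.
print(by_letter.get("z", []))
print("e" in by_letter, "z" in by_letter)
```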

From a structure made of pairs, with dict()¶

These pairs will usually come from zip or enumerate; see the relevant functions.

In [131]:

tpl=[("one","bir"),("two","iki"),("three","üç")]
dict_=dict(tpl)
print(type(dict_))
dict_["one"]

<class 'dict'>

Out[131]:

'bir'

With a comprehension¶

In [132]:

sayılar=list(range(10))
ciftlerinkaresi={x: x**2 for x in sayılar if x%2==0}
print(ciftlerinkaresi.items())

dict_items([(0, 0), (2, 4), (4, 16), (6, 36), (8, 64)])

Iterating over the elements¶

In [133]:

for k,v in ciftlerinkaresi.items():
    print(k,v)

Various methods¶

In [134]:

try:
  ciftlerinkaresi.clear()
  ciftlerinkaresi.items()
  del ciftlerinkaresi
  ciftlerinkaresi #raises NameError, it is gone from memory
except Exception as err:
  print(err)

Out[134]:

dict_items([])

name 'ciftlerinkaresi' is not defined

Set¶

Sets are unordered. Like dicts they are written inside {}. They hold unique values. They are used a lot for removing duplicates from a list and for membership tests.

In [135]:

liste=[1,1,2,3,4,4,5]
set_=set(liste)
set_

Out[135]:

{1, 2, 3, 4, 5}

In [136]:

set1={1,2,3,4,5}
set2={2,3,4}
set3={2,3,4,5,6}

printy(set1,set2,set3)
printy("----diff")
printy(set1.difference(set2))
printy(set1.difference(set3))
printy(set2.difference(set1))
printy(set2.difference(set3))
printy(set3.difference(set1))
printy(set3.difference(set2))
printy("- intersection----")
printy(set1.intersection(set2))
printy(set1.intersection(set3))
printy(set2.intersection(set1))
printy(set2.intersection(set3))
printy(set3.intersection(set1))
printy(set3.intersection(set2))
printy("----union---")
printy(set1.union(set2))
printy(set1.union(set3))
printy(set2.union(set1))
printy(set2.union(set3))
printy(set3.union(set1))
printy(set3.union(set2))

(5, ['printy(set1,set2,set3)\n']) 
----------
{1, 2, 3, 4, 5} {2, 3, 4} {2, 3, 4, 5, 6}
 
(6, ['printy("----diff")\n']) 
----------
----diff
 
(7, ['printy(set1.difference(set2))\n']) 
----------
{1, 5}
 
(8, ['printy(set1.difference(set3))\n']) 
----------
{1}
 
(9, ['printy(set2.difference(set1))\n']) 
----------
set()
 
(10, ['printy(set2.difference(set3))\n']) 
----------
set()
 
(11, ['printy(set3.difference(set1))\n']) 
----------
{6}
 
(12, ['printy(set3.difference(set2))\n']) 
----------
{5, 6}
 
(13, ['printy("- intersection----")\n']) 
----------
- intersection----
 
(14, ['printy(set1.intersection(set2))\n']) 
----------
{2, 3, 4}
 
(15, ['printy(set1.intersection(set3))\n']) 
----------
{2, 3, 4, 5}
 
(16, ['printy(set2.intersection(set1))\n']) 
----------
{2, 3, 4}
 
(17, ['printy(set2.intersection(set3))\n']) 
----------
{2, 3, 4}
 
(18, ['printy(set3.intersection(set1))\n']) 
----------
{2, 3, 4, 5}
 
(19, ['printy(set3.intersection(set2))\n']) 
----------
{2, 3, 4}
 
(20, ['printy("----union---")\n']) 
----------
----union---
 
(21, ['printy(set1.union(set2))\n']) 
----------
{1, 2, 3, 4, 5}
 
(22, ['printy(set1.union(set3))\n']) 
----------
{1, 2, 3, 4, 5, 6}
 
(23, ['printy(set2.union(set1))\n']) 
----------
{1, 2, 3, 4, 5}
 
(24, ['printy(set2.union(set3))\n']) 
----------
{2, 3, 4, 5, 6}
 
(25, ['printy(set3.union(set1))\n']) 
----------
{1, 2, 3, 4, 5, 6}
 
(26, ['printy(set3.union(set2))\n']) 
----------
{2, 3, 4, 5, 6}
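
The same operations also exist as operators, which is shorter to write. A brief sketch on set1 and set3 from above, plus the symmetric difference and a subset test that the cell above does not cover:

```python
set1 = {1, 2, 3, 4, 5}
set3 = {2, 3, 4, 5, 6}

print(set1 - set3)     # difference
print(set1 & set3)     # intersection
print(set1 | set3)     # union
print(set1 ^ set3)     # symmetric difference: in exactly one of the two
print({2, 3} <= set1)  # subset test
```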

Note: There is a way to edit all the vertically aligned printy calls above in one go. That is how I turned them all from print into printy at once: select them while holding the Alt key. If I select the ty on every line this way and press t, every ty becomes t.

Zip¶

In [137]:

x=[1,2,3]
y=[10,20,30]
onkatlar=zip(x,y)
#print(list(onkatlar)) #convert to a list to print it. zip is an iterator: once list() has consumed it, it is empty,
#so the block below would fail. that is why I commented this out for now. try it and see

In [138]:

#to split it back apart
x2, y2 = zip(*onkatlar)
x2

Out[138]:

(1, 2, 3)
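
The one-pass behaviour mentioned in the comments above can be checked directly. In this sketch the second list is one element longer on purpose, to show that zip also stops at the shorter input.

```python
x = [1, 2, 3]
y = [10, 20, 30, 40]        # one element longer on purpose

pairs = zip(x, y)
first_pass = list(pairs)    # consumes the iterator
second_pass = list(pairs)   # nothing left to yield

print(first_pass)           # zip stops at the shorter input
print(second_pass)
```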

Zip vs Dict¶

In [139]:

a=[1,2,3]
b=[10,20,30]
c=zip(a,b)
for i,j in c:
    print(i,j)

1 10
2 20
3 30

In [140]:

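a=[1,2,3]
b=[10,20,30]
c=list(zip(a,b)) #materialise the pairs: a bare zip object can only be consumed once
print(type(c[0]))
dict_=dict(c) #building a dict from the zipped pairs
for k,v in dict_.items():
    print(k,v)

<class 'tuple'>
1 10
2 20
3 30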

Important functions used with lists¶

Map and Reduce¶

Map: sends the elements of a data structure to a function one by one; the result is again an iterable, with one result per element
Reduce: sends the elements in order and folds them together: the result so far is combined with the next element at each step

Map¶

In [141]:

items=[1,2,3,4,5]
def kareal(sayı):
    return sayı**2

kareler=map(kareal,items) #a lambda works too: map(lambda x: x**2, items)
list(kareler) #convert to a list to display it

Out[141]:

[1, 4, 9, 16, 25]

In [142]:

#more than one data structure can take part
range1=range(1,10)
range2=range(1,20,2)

mymap=map(lambda x,y:x*y,range1,range2)
for i in mymap:
    print(i)

In [143]:

#it can also be done with a comprehension.
çarpım=[x*y for x,y in zip(range1,range2)]
çarpım

Out[143]:

[1, 6, 15, 28, 45, 66, 91, 120, 153]

In [144]:

harfler=map(chr,range(97,112))
list(harfler)

Out[144]:

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']

In [145]:

harfler2=[chr(x) for x in range(97,112)]
harfler2

Out[145]:

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o']

Reduce¶

In [146]:

from functools import reduce
def faktoriyel(sayı1,sayı2):
    return sayı1*sayı2

sayılar=range(1,10)
fakt=reduce(faktoriyel,sayılar)
fakt

Out[146]:

362880
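
reduce also takes an optional third argument, a starting value, which makes it safe on empty input. A short sketch using operator.mul instead of a hand-written function, checked against math.factorial and math.prod:

```python
import math
from functools import reduce
from operator import mul

numbers = range(1, 10)

product = reduce(mul, numbers, 1)     # third argument: starting value
print(product == math.factorial(9))   # 1*2*...*9 is 9!
print(product == math.prod(numbers))  # math.prod exists since Python 3.8

# With a starting value, reducing an empty sequence is safe.
print(reduce(mul, [], 1))
```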

Filter¶

In [147]:

ages = [5, 12, 17, 18, 24, 32]

def myFunc(x):
    if x < 18:
        return False
    else:
        return True

adults = filter(myFunc, ages)

for x in adults:
    print(x)

18
24
32

In [148]:

#or with a lambda
list(filter(lambda x:x>=18,ages))

Out[148]:

[18, 24, 32]

Enumerate¶

In [149]:

aylar=["Ocak","Şubat","Mart"]
print(list(enumerate(aylar,start=1)))
dict_=dict(enumerate(aylar,start=1))
for k,v in dict_.items():
    print(k,v)

[(1, 'Ocak'), (2, 'Şubat'), (3, 'Mart')]
1 Ocak
2 Şubat
3 Mart
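
enumerate is also the usual replacement for the range(len(...)) loop shown in the loops section: it hands you the index and the element together. A minimal sketch with the fruits list from earlier:

```python
fruits = ["apple", "banana", "cherry"]

# enumerate yields (index, element) pairs, no manual indexing needed.
lines = []
for i, fruit in enumerate(fruits):
    lines.append(f"{i}: {fruit}")
print(lines)
```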

All and Any¶

In [150]:

liste1=[True,True,False]
liste2=[True,True,True]
print(all(liste1))
print(any(liste1))
print(all(liste2))

False
True
True
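
In practice all and any are mostly fed a generator expression that tests each element. A sketch using the ages list from the Filter section:

```python
ages = [5, 12, 17, 18, 24, 32]

print(all(age >= 0 for age in ages))    # every age is non-negative
print(any(age >= 18 for age in ages))   # at least one adult
print(all(age >= 18 for age in ages))   # not everyone is an adult

# Edge case: on an empty iterable all() is True and any() is False.
print(all([]), any([]))
```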

Date and time operations¶

Modules and their members¶

In [151]:

import datetime
import time
import timeit
import dateutil
import calendar
#import timer as tr #this one is about threads, do not use it; it has settimer and killtimer

The most used of these is datetime, and inside this module there is also a type called datetime. It is important to decide which one to import. If I were you, I would import only the module and reach the type and the other members through it.

datetime¶

In [152]:

print([i for i in dir(datetime) if not "__" in i])

['MAXYEAR', 'MINYEAR', 'date', 'datetime', 'datetime_CAPI', 'sys', 'time', 'timedelta', 'timezone', 'tzinfo']

In [153]:

datetime.datetime.now()
datetime.datetime.now().hour #everything from the microsecond up to the year is available
datetime.date.today()
datetime.date.today().year
datetime.date(2019,4,3)

Out[153]:

datetime.datetime(2024, 9, 22, 15, 25, 25, 30003)

Out[153]:

datetime.date(2024, 9, 22)

Out[153]:

datetime.date(2019, 4, 3)

In [154]:

datetime.datetime.now().day # or datetime.date.today().day

Out[154]:

time (we will use this one for measuring durations, via its time function)¶

In [155]:

print([i for i in dir(time) if not "__" in i])

['CLOCK_BOOTTIME', 'CLOCK_MONOTONIC', 'CLOCK_MONOTONIC_RAW', 'CLOCK_PROCESS_CPUTIME_ID', 'CLOCK_REALTIME', 'CLOCK_TAI', 'CLOCK_THREAD_CPUTIME_ID', '_STRUCT_TM_ITEMS', 'altzone', 'asctime', 'clock_getres', 'clock_gettime', 'clock_gettime_ns', 'clock_settime', 'clock_settime_ns', 'ctime', 'daylight', 'get_clock_info', 'gmtime', 'localtime', 'mktime', 'monotonic', 'monotonic_ns', 'perf_counter', 'perf_counter_ns', 'process_time', 'process_time_ns', 'pthread_getcpuclockid', 'sleep', 'strftime', 'strptime', 'struct_time', 'thread_time', 'thread_time_ns', 'time', 'time_ns', 'timezone', 'tzname', 'tzset']

In [156]:

print([i for i in dir(timeit) if not "__" in i])

['Timer', '_globals', 'default_number', 'default_repeat', 'default_timer', 'dummy_src_name', 'gc', 'itertools', 'main', 'reindent', 'repeat', 'sys', 'template', 'time', 'timeit']

dateutil¶

In [157]:

print([i for i in dir(dateutil) if not "__" in i])

['_common', '_version', 'parser', 'relativedelta', 'tz']

Calendar¶

In [158]:

calendar.mdays

Out[158]:

[0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

In [159]:

calendar.Calendar().monthdayscalendar(datetime.date.today().year,datetime.date.today().month)

Out[159]:

[[0, 0, 0, 0, 0, 0, 1],
 [2, 3, 4, 5, 6, 7, 8],
 [9, 10, 11, 12, 13, 14, 15],
 [16, 17, 18, 19, 20, 21, 22],
 [23, 24, 25, 26, 27, 28, 29],
 [30, 0, 0, 0, 0, 0, 0]]

In [160]:

#number of days in this month (mdays has a fixed 28 for February, so leap years are not handled)
calendar.mdays[datetime.date.today().month]

Out[160]:

Date differences¶

In [161]:

datetime.date.today()-datetime.date(2024,4,3)

Out[161]:

datetime.timedelta(days=172)

In [162]:

#let's take the difference in days
(datetime.date.today()-datetime.date(2024,4,3)).days

Out[162]:

172

In [163]:

#30 days from now
datetime.date.today()+datetime.timedelta(days=30)
#1 month from now (the month may have 30 or 31 days, or even 28/29)
datetime.date.today()+datetime.timedelta(days=calendar.mdays[datetime.date.today().month])

Out[163]:

datetime.date(2024, 10, 22)

Out[163]:

datetime.date(2024, 10, 22)
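
Since calendar.mdays always holds 28 for February, a leap-year-aware alternative is calendar.monthrange, which returns the weekday of the first day and the number of days in the month. A sketch; days_in_month is a hypothetical helper, and the start date is fixed so the result is reproducible.

```python
import calendar
import datetime

def days_in_month(year, month):
    """Number of days in the given month, leap years included."""
    # monthrange returns (weekday of the 1st, number of days)
    return calendar.monthrange(year, month)[1]

print(days_in_month(2024, 2))   # 2024 is a leap year
print(days_in_month(2023, 2))

start = datetime.date(2024, 2, 10)   # fixed date so the result is reproducible
print(start + datetime.timedelta(days=days_in_month(start.year, start.month)))
```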

Conversions¶

In [164]:

# strftime: convert a datetime object to a string
now = datetime.datetime.now()
date_string = now.strftime("%Y-%m-%d %H:%M:%S")
print(date_string)

# strptime: convert a string to a datetime object
date_string = "2023-10-26 15:30:00"
datetime_object = datetime.datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
print(datetime_object, type(datetime_object))

2024-09-22 15:25:25
2023-10-26 15:30:00 <class 'datetime.datetime'>

datetime.strptime parses date and time text in one specific format, whereas dateutil.parser.parse (from the dateutil module listed above) is a more general and more flexible parser that recognises many different formats. It can also be used for date and time text whose format is not known in advance.

In [165]:

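import dateutil.parser #import the submodule explicitly; a bare "import dateutil" does not always load it
from_util = dateutil.parser.parse(date_string)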
print(from_util, type(from_util))

2023-10-26 15:30:00 <class 'datetime.datetime'>

Measuring elapsed time¶

Timing for performance (independent of the result)¶

The single-% form (line magic) is followed by the statement to time on the same line; with the %% form (cell magic) you write the code on the lines below.
%time reports the duration of a single run, whereas %timeit runs the code many times and reports the average.

In [166]:

%%timeit
x=sum(range(1000000))

37.7 ms ± 2.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [167]:

%%time
x=sum(range(1000000))

CPU times: user 35.5 ms, sys: 0 ns, total: 35.5 ms
Wall time: 35.6 ms

In [168]:

def hesapla():
    y=sum(range(1000000))

%timeit hesapla()

25.7 ms ± 7.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Seeing the result together with the duration¶

In [169]:

bas=time.time()
print("hey")
hesapla()
bit=time.time()
print("süre:{}".format(bit-bas))

hey
süre:0.019402027130126953
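
Outside a notebook the % magics are not available, so here is a sketch of the same two ideas in plain Python. I use time.perf_counter, a clock intended for measuring intervals, and the timeit module; hesapla is redefined here to return its result, and the repeat count of 5 is arbitrary.

```python
import time
import timeit

def hesapla():
    return sum(range(1_000_000))

# perf_counter is a monotonic, high-resolution clock meant for intervals.
start = time.perf_counter()
result = hesapla()
elapsed = time.perf_counter() - start
print(f"result={result}, elapsed={elapsed:.4f}s")

# timeit.timeit returns the total seconds for `number` calls.
total = timeit.timeit(hesapla, number=5)
print(f"average per call: {total / 5:.4f}s")
```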

jupyter nbextensions¶

If you still have not installed nbextensions, here is one more reason to. There is no need for this timing code: with the extension above, the run time of every cell is shown. An example:

Some important modules¶

Below you will find some modules I consider important. Of course there are other libraries/modules besides these, and you will learn them over time.

os¶

When we need code (for the file system and the like) that works independently of the operating system, we use the os module.

Folder and file operations¶

In [170]:

import os
os.getcwd() #the current working folder
os.curdir #the string that stands for the current directory, usually "." . We mostly use it as a path prefix

Out[170]:

'/content'

Out[170]:

'.'

In [171]:

os.listdir() #lists the entries (files and folders) in the current working folder

Out[171]:

['.config', 'drive', 'sample_data']

In [172]:

os.chdir("drive") #we change the working folder
os.listdir()

Out[172]:

['Othercomputers',
 '.file-revisions-by-id',
 'MyDrive',
 '.shortcut-targets-by-id',
 '.Trash-0']

In [173]:

os.pardir #the string that stands for the parent folder

Out[173]:

'..'

In [174]:

os.listdir(os.pardir)

Out[174]:

['.config', 'drive', 'sample_data']

In [175]:

os.path.exists('Python Genel.ipynb') #does a file/folder exist

Out[175]:

False

In [176]:

os.path.isfile('olmayandosya.txt') #is the argument a file? can also be used to check whether a file exists.

Out[176]:

False

In [177]:

os.path.isdir("klasor")

Out[177]:

False

In [178]:

# list all files in a directory in Python.
from os import listdir
from os.path import isfile, join
p=os.getcwd()
files_list = [f for f in listdir(p) if isfile(join(p, f))]
print(files_list)

[]
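
The same checks can be written with pathlib, the object-oriented alternative to os.path in the standard library. This sketch works in a temporary folder so that it does not depend on your own files; the file and folder names are made up.

```python
import tempfile
from pathlib import Path

# Work in a throwaway folder so the example does not depend on your files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("hello")     # hypothetical file
    (root / "sub").mkdir()                       # hypothetical folder

    files = sorted(p.name for p in root.iterdir() if p.is_file())
    folders = sorted(p.name for p in root.iterdir() if p.is_dir())
    print(files, folders)
    print((root / "missing.txt").exists())
```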

In [179]:

dizin= "/content/drive/MyDrive/Colab Notebooks"
for kökdizin, altdizinler, dosyalar in os.walk(dizin):
    print(kökdizin)

/content/drive/MyDrive/Colab Notebooks
/content/drive/MyDrive/Colab Notebooks/.config
/content/drive/MyDrive/Colab Notebooks/.config/startup

In [180]:

for kökdizin, altdizinler, dosyalar in os.walk(dizin):
    for dosya in dosyalar:
        print(os.path.join(kökdizin, dosya))

/content/drive/MyDrive/Colab Notebooks/Snippets: Drive adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/Snippets: Accessing files adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/Data Table Display adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/Forms adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/gemini.ipynb adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/Harici veriler: Yerel Dosyalar, Drive, E-Tablolar ve Cloud Storage adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/_My Custom Snippets.ipynb
/content/drive/MyDrive/Colab Notebooks/_Colab Aboneliğinizden En İyi Şekilde Yararlanma
/content/drive/MyDrive/Colab Notebooks/_pathler, drive, github.ipynb
/content/drive/MyDrive/Colab Notebooks/pandas.ipynb adlı not defterinin kopyası
/content/drive/MyDrive/Colab Notebooks/.config/startup/starters.py

In [181]:

%cd "MyDrive"

/content/drive/MyDrive

In [182]:

!touch abc.txt #let's create a file

In [183]:

import os.path, time, datetime
print("Last modified: %s" % os.path.getmtime("abc.txt")) #raw timestamp, not formatted
print("Created: %s" % time.ctime(os.path.getctime("abc.txt"))) #note: on Linux getctime is the last metadata change, not the creation time

Last modified: 1727018731.0
Created: Sun Sep 22 15:25:31 2024

In [184]:

!rm abc.txt #let's delete it again

In [185]:

#Windows only. For Linux-->https://stackoverflow.com/questions/17317219/is-there-an-platform-independent-equivalent-of-os-startfile
# os.startfile('calc.exe') #opens a file, starts a program, goes to a web site, etc.

In [186]:

# os.startfile("/falanca/klasördeki/filanca/dosya")

In [187]:

# os.startfile("www.volkanyurtseven.com")

System and environment information¶

In [188]:

import platform, os
print(os.name)
print(platform.system())
print(platform.release())

posix
Linux
6.1.85+

In [189]:

dict(os.environ)

Out[189]:

{'SHELL': '/bin/bash',
 'NV_LIBCUBLAS_VERSION': '12.2.5.6-1',
 'NVIDIA_VISIBLE_DEVICES': 'all',
 'COLAB_JUPYTER_TRANSPORT': 'ipc',
 'NV_NVML_DEV_VERSION': '12.2.140-1',
 'NV_CUDNN_PACKAGE_NAME': 'libcudnn8',
 'CGROUP_MEMORY_EVENTS': '/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events',
 'NV_LIBNCCL_DEV_PACKAGE': 'libnccl-dev=2.19.3-1+cuda12.2',
 'NV_LIBNCCL_DEV_PACKAGE_VERSION': '2.19.3-1',
 'VM_GCE_METADATA_HOST': '169.254.169.253',
 'HOSTNAME': '51c8695c6b32',
 'LANGUAGE': 'en_US',
 'TBE_RUNTIME_ADDR': '172.28.0.1:8011',
 'COLAB_TPU_1VM': '',
 'GCE_METADATA_TIMEOUT': '3',
 'NVIDIA_REQUIRE_CUDA': 'cuda>=12.2 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=quadrortx,driver>=470,driver<471 brand=titan,driver>=470,driver<471 brand=titanrtx,driver>=470,driver<471 brand=tesla,driver>=525,driver<526 brand=unknown,driver>=525,driver<526 brand=nvidia,driver>=525,driver<526 brand=nvidiartx,driver>=525,driver<526 brand=geforce,driver>=525,driver<526 brand=geforcertx,driver>=525,driver<526 brand=quadro,driver>=525,driver<526 brand=quadrortx,driver>=525,driver<526 brand=titan,driver>=525,driver<526 brand=titanrtx,driver>=525,driver<526',
 'NV_LIBCUBLAS_DEV_PACKAGE': 'libcublas-dev-12-2=12.2.5.6-1',
 'NV_NVTX_VERSION': '12.2.140-1',
 'COLAB_JUPYTER_IP': '172.28.0.12',
 'NV_CUDA_CUDART_DEV_VERSION': '12.2.140-1',
 'NV_LIBCUSPARSE_VERSION': '12.1.2.141-1',
 'COLAB_LANGUAGE_SERVER_PROXY_ROOT_URL': 'http://172.28.0.1:8013/',
 'NV_LIBNPP_VERSION': '12.2.1.4-1',
 'NCCL_VERSION': '2.19.3-1',
 'KMP_LISTEN_PORT': '6000',
 'TF_FORCE_GPU_ALLOW_GROWTH': 'true',
 'ENV': '/root/.bashrc',
 'PWD': '/',
 'TBE_EPHEM_CREDS_ADDR': '172.28.0.1:8009',
 'COLAB_LANGUAGE_SERVER_PROXY_REQUEST_TIMEOUT': '30s',
 'TBE_CREDS_ADDR': '172.28.0.1:8008',
 'NV_CUDNN_PACKAGE': 'libcudnn8=8.9.6.50-1+cuda12.2',
 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility',
 'COLAB_JUPYTER_TOKEN': '',
 'LAST_FORCED_REBUILD': '20240627',
 'NV_NVPROF_DEV_PACKAGE': 'cuda-nvprof-12-2=12.2.142-1',
 'NV_LIBNPP_PACKAGE': 'libnpp-12-2=12.2.1.4-1',
 'NV_LIBNCCL_DEV_PACKAGE_NAME': 'libnccl-dev',
 'TCLLIBPATH': '/usr/share/tcltk/tcllib1.20',
 'NV_LIBCUBLAS_DEV_VERSION': '12.2.5.6-1',
 'NVIDIA_PRODUCT_NAME': 'CUDA',
 'COLAB_KERNEL_MANAGER_PROXY_HOST': '172.28.0.12',
 'NV_LIBCUBLAS_DEV_PACKAGE_NAME': 'libcublas-dev-12-2',
 'NV_CUDA_CUDART_VERSION': '12.2.140-1',
 'COLAB_WARMUP_DEFAULTS': '1',
 'HOME': '/root',
 'LANG': 'en_US.UTF-8',
 'COLUMNS': '100',
 'CUDA_VERSION': '12.2.2',
 'CLOUDSDK_CONFIG': '/content/.config',
 'NV_LIBCUBLAS_PACKAGE': 'libcublas-12-2=12.2.5.6-1',
 'NV_CUDA_NSIGHT_COMPUTE_DEV_PACKAGE': 'cuda-nsight-compute-12-2=12.2.2-1',
 'COLAB_RELEASE_TAG': 'release-colab_20240919-060125_RC00',
 'KMP_TARGET_PORT': '9000',
 'KMP_EXTRA_ARGS': '--logtostderr --listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https://colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/m-s-2x16lz196cw9 --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true --gorilla_ws_opt_in --log_code_content',
 'NV_LIBNPP_DEV_PACKAGE': 'libnpp-dev-12-2=12.2.1.4-1',
 'COLAB_LANGUAGE_SERVER_PROXY_LSP_DIRS': '/datalab/web/pyright/typeshed-fallback/stdlib,/usr/local/lib/python3.10/dist-packages',
 'NV_LIBCUBLAS_PACKAGE_NAME': 'libcublas-12-2',
 'COLAB_KERNEL_MANAGER_PROXY_PORT': '6000',
 'CLOUDSDK_PYTHON': 'python3',
 'NV_LIBNPP_DEV_VERSION': '12.2.1.4-1',
 'NO_GCE_CHECK': 'False',
 'PYTHONPATH': '/env/python',
 'NV_LIBCUSPARSE_DEV_VERSION': '12.1.2.141-1',
 'LIBRARY_PATH': '/usr/local/cuda/lib64/stubs',
 'NV_CUDNN_VERSION': '8.9.6.50',
 'SHLVL': '0',
 'NV_CUDA_LIB_VERSION': '12.2.2-1',
 'COLAB_LANGUAGE_SERVER_PROXY': '/usr/colab/bin/language_service',
 'NVARCH': 'x86_64',
 'NV_CUDNN_PACKAGE_DEV': 'libcudnn8-dev=8.9.6.50-1+cuda12.2',
 'NV_CUDA_COMPAT_PACKAGE': 'cuda-compat-12-2',
 'NV_LIBNCCL_PACKAGE': 'libnccl2=2.19.3-1+cuda12.2',
 'LD_LIBRARY_PATH': '/usr/local/nvidia/lib:/usr/local/nvidia/lib64',
 'COLAB_GPU': '',
 'NV_CUDA_NSIGHT_COMPUTE_VERSION': '12.2.2-1',
 'GCS_READ_CACHE_BLOCK_SIZE_MB': '16',
 'NV_NVPROF_VERSION': '12.2.142-1',
 'LC_ALL': 'en_US.UTF-8',
 'COLAB_FILE_HANDLER_ADDR': 'localhost:3453',
 'PATH': '/opt/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin',
 'NV_LIBNCCL_PACKAGE_NAME': 'libnccl2',
 'COLAB_DEBUG_ADAPTER_MUX_PATH': '/usr/local/bin/dap_multiplexer',
 'NV_LIBNCCL_PACKAGE_VERSION': '2.19.3-1',
 'PYTHONWARNINGS': 'ignore:::pip._internal.cli.base_command',
 'DEBIAN_FRONTEND': 'noninteractive',
 'COLAB_BACKEND_VERSION': 'next',
 'OLDPWD': '/',
 'JPY_PARENT_PID': '92',
 'TERM': 'xterm-color',
 'CLICOLOR': '1',
 'PAGER': 'cat',
 'GIT_PAGER': 'cat',
 'MPLBACKEND': 'module://ipykernel.pylab.backend_inline',
 'ENABLE_DIRECTORYPREFETCHER': '1',
 'USE_AUTH_EPHEM': '1',
 'PYDEVD_USE_FRAME_EVAL': 'NO'}

In [190]:

# print(dict(os.environ)["HOMEPATH"])
# print(dict(os.environ)["USERNAME"])
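
HOMEPATH and USERNAME are Windows variables, so indexing them on Linux raises KeyError, which is why the lines above are commented out. A sketch of a safer lookup with os.environ.get; MY_DEMO_SETTING is a made-up variable name used only for this example.

```python
import os

# get returns the fallback when the variable is not set on this system.
home = os.environ.get("HOMEPATH", os.environ.get("HOME", "unknown"))
print(home)

# "MY_DEMO_SETTING" is a made-up name used only for this example.
os.environ["MY_DEMO_SETTING"] = "42"      # values are always strings
print(int(os.environ["MY_DEMO_SETTING"]) + 1)
del os.environ["MY_DEMO_SETTING"]
print("MY_DEMO_SETTING" in os.environ)
```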

random¶

In [191]:

import random
dizi=[3,5,8,3,2,1,9]
random.seed(1)
print(random.randint(2,48)) #between 2 and 48, both ends included
print(random.randrange(1,100,3))
print(random.choice(dizi))
print(random.choices(dizi,k=3)) #draws with replacement, unlike sample, so the same element can come up again
print(random.random()) #between 0 and 1
print(random.uniform(2,5)) #a float between 2 and 5
print(random.sample(dizi,2)) #draws without replacement, unlike choices

10
13
8
[3, 1, 3]
0.37961522332372777
2.6298644191144316
[3, 3]
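The `random.seed(1)` call above is what makes these numbers repeatable from run to run. A minimal sketch of that behaviour (the names `first` and `second` are only illustrative):

```python
import random

# Seeding with the same value restarts the generator from the same state.
random.seed(1)
first = [random.randint(2, 48) for _ in range(3)]

random.seed(1)
second = [random.randint(2, 48) for _ in range(3)]

print(first == second)  # the two sequences are identical
```

Without a fixed seed, the generator is seeded from a system source and the two lists would almost always differ.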

In [192]:

#shuffle the list in place
random.shuffle(dizi)
dizi

Out[192]:

[2, 5, 8, 3, 1, 3, 9]

array¶

array is almost the same as list; the important difference is that it only accepts elements of one declared type. Because of this it is more memory-efficient than a list for large collections, although lists are generally faster to work with. If you are going to use a numeric array, I recommend NumPy instead. Apart from that, array can be an option when you need to build very large arrays of characters.

In [193]:

from array import array
import sys
arr=array('i',[1,2,3])
lst=[1,2,3]
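print(sys.getsizeof(arr),sys.getsizeof(lst)) #with only 3 elements the array is not smaller (fixed overhead); the saving shows up with many elements, and getsizeof(lst) excludes the int objects themselves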

92 88
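With three elements the fixed overhead dominates, which is why the array is not smaller above. The sketch below repeats the comparison with a larger, arbitrarily chosen size; it assumes a typical 64-bit CPython, where a list stores one 8-byte pointer per element and `'i'` usually stores 4 bytes per element.

```python
import sys
from array import array

n = 100_000  # illustrative size, not from the original notebook
big_arr = array('i', range(n))
big_lst = list(range(n))

arr_bytes = sys.getsizeof(big_arr)
lst_bytes = sys.getsizeof(big_lst)  # pointers only; the int objects are not counted

print(arr_bytes, lst_bytes)
print(arr_bytes < lst_bytes)
```

The real gap is larger than these numbers suggest, because every element of the list is also a separate Python int object.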

collections¶

In [194]:

my_list = ["a","b","c","c","e","c","b","b","a"]
my_list.count("a")

Out[194]:

2

In [195]:

from collections import Counter

cn = Counter(my_list) #a dictionary-like structure; all dictionary methods are available
print(cn)
print(cn["a"]) #same result as the list.count call above

Counter({'b': 3, 'c': 3, 'a': 2, 'e': 1})
2

In [196]:

str_ = "sen seni bil sen seni sen bilmezsen kendini"
cn = Counter(str_.split(' '))
cn

Out[196]:

Counter({'sen': 3, 'seni': 2, 'bil': 1, 'bilmezsen': 1, 'kendini': 1})

In [197]:

print(cn.most_common(2)) #the 2 most frequent words
print(list(cn)) #the keys converted to a list
print(dict(cn)) #the key-value pairs converted to a plain dictionary

[('sen', 3), ('seni', 2)]
['sen', 'seni', 'bil', 'bilmezsen', 'kendini']
{'sen': 3, 'seni': 2, 'bil': 1, 'bilmezsen': 1, 'kendini': 1}

In [198]:

#OrderedDict is rarely needed anymore: since Python 3.7 a plain dict also preserves insertion order
#the example below shows this
from collections import OrderedDict
liste = ["a","c","c","a","b","a","a","b","c"]
cnt = Counter(liste)
od = OrderedDict(cnt.most_common())
d=dict(cnt.most_common())
for key, value in od.items():
    print(key, value)
print("----")
for key, value in d.items():
    print(key, value)

a 4
c 3
b 2
----
a 4
c 3
b 2

In [199]:

from collections import namedtuple

Student = namedtuple('Student', 'fname, lname, age') #usage similar to a class; no more memorizing indexes!
s1 = Student('John', 'Clarke', '13')
print(s1.fname)

John

itertools¶

Permutation¶

In [200]:

from itertools import permutations #permutations without repetition only: n!/(n-r)! results
liste=["A","B","C"]
per1=list(permutations(liste))
per2=list(permutations(liste,2))
print(per1)
print(per2)

[('A', 'B', 'C'), ('A', 'C', 'B'), ('B', 'A', 'C'), ('B', 'C', 'A'), ('C', 'A', 'B'), ('C', 'B', 'A')]
[('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]

Combination¶

In [201]:

from itertools import combinations #n!/(r!*(n-r)!)
com=list(combinations(liste,2))
com

Out[201]:

[('A', 'B'), ('A', 'C'), ('B', 'C')]
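The formulas in the comments, n!/(n-r)! and n!/(r!*(n-r)!), can be checked against the generated results. A small sketch, assuming Python 3.8+ for `math.perm` and `math.comb` (the name `items` is illustrative):

```python
from itertools import combinations, permutations
from math import comb, perm

items = ["A", "B", "C"]
n, r = len(items), 2

n_perm = len(list(permutations(items, r)))
n_comb = len(list(combinations(items, r)))

print(n_perm, perm(n, r))  # generated count vs. n!/(n-r)!
print(n_comb, comb(n, r))  # generated count vs. n!/(r!*(n-r)!)
```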

Cartesian product¶

In [202]:

from itertools import product
list1=["A","B","C"]
list2=[2000,2001,2002,2003]
krt=list(product(list1,list2))
krt

Out[202]:

[('A', 2000),
 ('A', 2001),
 ('A', 2002),
 ('A', 2003),
 ('B', 2000),
 ('B', 2001),
 ('B', 2002),
 ('B', 2003),
 ('C', 2000),
 ('C', 2001),
 ('C', 2002),
 ('C', 2003)]

Regex (Regular expressions)¶

Regex is not specific to Python; almost every language has an implementation. It is a big topic in its own right, and I will only try to give a summary here. If you plan to work on text mining, NLP and so on, it is worth learning it well.

Warm-up¶

In [203]:

import re

In [204]:

a="benim adım volkan"

In [205]:

#match: checks whether the pattern occurs at the start of the string
re.match("volkan",a) #no match, so None is returned and nothing is displayed

In [206]:

re.match("ben",a) #matches

Out[206]:

<re.Match object; span=(0, 3), match='ben'>

In [207]:

a.startswith("ben") #this is more convenient for a fixed prefix

Out[207]:

True

In [208]:

#search: is there a match anywhere in the string? It may have to scan the whole string, so it can be slower than match
re.search("volkan",a)

Out[208]:

<re.Match object; span=(11, 17), match='volkan'>

In [209]:

"volkan" in a #this is more practical for a fixed substring

Out[209]:

True

In [210]:

b="sen seni bil sen seni"
bul=re.findall("sen", b)

print(bul)
print(len(bul))
print(b.count("sen"))

['sen', 'sen', 'sen', 'sen']
4
4

Using metacharacters¶

In [211]:

isimler=["123a","ali","veli","hakan","volkan","osman","kandemir","VOLkan"]

In [212]:

#names ending with "kan", without regex
kan=[x for x in isimler if x[-3:]=="kan"]
kan

Out[212]:

['hakan', 'volkan', 'VOLkan']

[ ] Square brackets¶

[] matches any one of the characters listed inside it. The important point is that every character inside [] is treated as a separate alternative. E.g. [abc] means "a or b or c", and [a-z] means any lowercase letter from a to z.

In [213]:

#regex ile
for i in isimler:
    if re.search("[a-z]kan",i):
        print(i)
#"[a-z]kan" means: a lowercase letter (a-z) immediately followed by "kan", anywhere in the string.
#So "kandemir" (nothing before "kan") and "VOLkan" (uppercase L) do not match; note that the pattern does not force "kan" to be at the end.

hakan
volkan
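To reproduce the non-regex result above (names that really end with "kan", including VOLkan), the pattern needs an end anchor and case-insensitive matching. A minimal sketch using `$` and `re.IGNORECASE`; the variable names are illustrative:

```python
import re

names = ["123a", "ali", "veli", "hakan", "volkan", "osman", "kandemir", "VOLkan"]

# "$" anchors the match to the end of the string;
# re.IGNORECASE lets [a-z] match uppercase letters too.
ends_with_kan = [n for n in names if re.search(r"[a-z]kan$", n, flags=re.IGNORECASE)]
print(ends_with_kan)  # ['hakan', 'volkan', 'VOLkan']
```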

In [214]:

#names containing e or m; this time with a list comprehension
[i for i in isimler if re.search("[em]",i)]

Out[214]:

['veli', 'osman', 'kandemir']

In [215]:

#items containing a digit
liste = ["123a","b123","1234","volkan"]
sayıiçerenler=[x for x in liste if re.search("[0-9]",x)]
sayıiçerenler

Out[215]:

['123a', 'b123', '1234']

In [216]:

#items starting with a digit
rakamlabaşlayanlar=[x for x in liste if re.match("[0-9]",x)] #unlike the cell above we use match, which only checks the start of the string
rakamlabaşlayanlar

Out[216]:

['123a', '1234']

In [217]:

liste=["ABC","Abc","abc","12a","A12","123","Ab1","Ab23"]
print([x for x in liste if re.search("[A-Za-z]",x)]) #contains an uppercase or lowercase letter
print([x for x in liste if re.search("[A-Z][a-z]",x)]) #contains an uppercase letter immediately followed by a lowercase letter
print([x for x in liste if re.search("[A-Za-z0-9]",x)]) #contains an uppercase letter, a lowercase letter or a digit
print([x for x in liste if re.search("[A-Z][a-z][0-9]",x)]) #contains an uppercase letter, a lowercase letter and a digit in a row

['ABC', 'Abc', 'abc', '12a', 'A12', 'Ab1', 'Ab23']
['Abc', 'Ab1', 'Ab23']
['ABC', 'Abc', 'abc', '12a', 'A12', '123', 'Ab1', 'Ab23']
['Ab1', 'Ab23']

. Dot¶

"." is a wildcard that stands for any single character.

In [218]:

isimler=["arhan","volkan","osman","hakan","demirhan","1ozan"]

In [219]:

#names that start with a letter, then any 2 characters, then "an" (a real name cannot start with a digit); in this list that gives the 5-character names ending in "an"
liste=[x for x in isimler if re.match("[a-z]..an",x)]
liste

Out[219]:

['arhan', 'osman', 'hakan']
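Note that `re.match` only anchors the start, so the pattern above does not by itself enforce the 5-character length. A sketch of the difference using `re.fullmatch`; the extra entry "arhanlar" is a made-up example added to show it:

```python
import re

names = ["arhan", "volkan", "osman", "hakan", "demirhan", "1ozan", "arhanlar"]

# match: the pattern must fit at the start, anything may follow
loose = [n for n in names if re.match(r"[a-z]..an", n)]
# fullmatch: the pattern must cover the whole string (exactly 5 characters here)
strict = [n for n in names if re.fullmatch(r"[a-z]..an", n)]

print(loose)   # ['arhan', 'osman', 'hakan', 'arhanlar']
print(strict)  # ['arhan', 'osman', 'hakan']
```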

* Star¶

Matches the preceding expression zero or more times (yes, zero, not one). It is especially useful for covering cases where a character may or may not be written. The example below illustrates this well.

In [220]:

liste = ["kıral", "kral", "kıro", "kro", "kırmızı","kırçıllı","kritik","kıritik","kııral"]
[x for x in liste if re.match("kı*r[aeıioöuü]",x)]

Out[220]:

['kıral', 'kral', 'kıro', 'kro', 'kritik', 'kıritik', 'kııral']

In this example we looked at loanwords that start with "kr". They are sometimes written with an "ı" inserted between the two letters. In such words the "r" is usually followed by a vowel, and that is what we tried to capture.

+ Plus¶

Similar to the star, but this time the preceding expression must occur at least once.

In [221]:

liste = ["kıral", "kral", "kıro", "kro", "kırmızı","kırçıllı","kritik","kıritik","kııral"]
[x for x in liste if re.match("kı+r[aeıioöuü]",x)]

Out[221]:

['kıral', 'kıro', 'kıritik', 'kııral']

In [222]:

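#the "names ending with kan" example again. ".+kan" needs at least one character of any kind before "kan", so it would match VOLkan too; isimler was redefined above, though, so only volkan and hakan are printed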
for i in isimler:
    if re.search(".+kan",i):
        print(i)

volkan
hakan

? Question mark¶

Matches when the preceding character occurs 0 or 1 times.

In [223]:

liste = ["kıral", "kral", "kıro", "kro", "kırmızı","kırçıllı","kritik","kıritik","kııral"]
[x for x in liste if re.match("kı?r[aeıioöuü]",x)]

Out[223]:

['kıral', 'kral', 'kıro', 'kro', 'kritik', 'kıritik']

{} Curly braces¶

Matches when the preceding character occurs exactly n times.

In [224]:

liste = ["kıral", "kral", "kıro", "kro", "kırmızı","kırçıllı","kritik","kıritik","kııral"]
[x for x in liste if re.match("kı{2}r[aeıioöuü]",x)]

Out[224]:

['kııral']

In [225]:

#this also works, but as n grows, using {} is the more sensible choice
[x for x in liste if re.match("kıır[aeıioöuü]",x)]

Out[225]:

['kııral']

In [226]:

#{min,max}
liste=["gol","gool","gooool","gööl","gooooooool"]
[x for x in liste if re.match("[a-z]o{2,5}l",x)] #"o" repeated at least 2 and at most 5 times

Out[226]:

['gool', 'gooool']
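The four quantifiers can be compared side by side on the same word list. The sketch below only reruns the patterns already shown above; the dictionary layout and names are illustrative:

```python
import re

words = ["kıral", "kral", "kıro", "kro", "kırmızı", "kırçıllı", "kritik", "kıritik", "kııral"]
vowel = "[aeıioöuü]"

# quantifier -> pattern prefix applied to the letter "ı"
quantifiers = {"*": "kı*r", "+": "kı+r", "?": "kı?r", "{2}": "kı{2}r"}

matches = {
    q: [w for w in words if re.match(prefix + vowel, w)]
    for q, prefix in quantifiers.items()
}

for q, found in matches.items():
    print(q, len(found), found)
```

`*` accepts any number of "ı" (including none), `+` needs at least one, `?` allows at most one, and `{2}` needs exactly two.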

Useful links¶

The world of regex is huge; rather than covering all of it here, this was only a short introduction. You first need to grasp regex in general and then its Python implementation. I recommend going through the links below.

General information

https://en.wikipedia.org/wiki/Regular_expression
https://regexr.com/ (for practice)

Python

json¶

From JSON to Python¶

The structure of JSON closely resembles Python dictionaries, and the object we get back will also be a dictionary.

In [227]:

import json

In [228]:

#from a sample JSON string
x =  '{ "name":"John", "age":30, "city":"New York"}'
y=json.loads(x) #loads it as a dictionary
type(x)
type(y)
y["name"]

Out[228]:

str

Out[228]:

dict

Out[228]:

'John'

In [231]:

#or from a JSON file. The file's indentation is not preserved, because the resulting object is a dictionary, not a string
import io
with io.open("/content/drive/MyDrive/Programming/PythonRocks/dataset/json/indentli_bolgesatis.json", 'r') as f:
    data = json.load(f)
type(data)
data

Out[231]:

dict

Out[231]:

{'Bolge': {'0': 'Akdeniz', '1': 'Marmara', '2': 'Akdeniz', '3': 'Marmara'},
 'Yil': {'0': 2020, '1': 2020, '2': 2021, '3': 2021},
 'Satis': {'0': 10, '1': 15, '2': 42, '3': 56}}

In [232]:

#to print the indented format, read the file into a string instead
with io.open("/content/drive/MyDrive/Programming/PythonRocks/dataset/json/indentli_bolgesatis.json", mode='r') as f:
    content=f.read()    # this is a string, and the indented layout is preserved
print(content)

{
    "Bolge":{
        "0":"Akdeniz",
        "1":"Marmara",
        "2":"Akdeniz",
        "3":"Marmara"
    },
    "Yil":{
        "0":2020,
        "1":2020,
        "2":2021,
        "3":2021
    },
    "Satis":{
        "0":10,
        "1":15,
        "2":42,
        "3":56
    }
}

In [233]:

#from a file saved with the "split" orientation
with io.open("/content/drive/MyDrive/Programming/PythonRocks/dataset/json/indentli_bolgesatis_split.json", mode='r') as f:
    content=f.read()
print(content)

{
    "columns":[
        "Bolge",
        "Yil",
        "Satis"
    ],
    "data":[
        [
            "Akdeniz",
            2020,
            10
        ],
        [
            "Marmara",
            2020,
            15
        ],
        [
            "Akdeniz",
            2021,
            42
        ],
        [
            "Marmara",
            2021,
            56
        ]
    ]
}

In [234]:

#from a file saved with the "table" orientation
with io.open("/content/drive/MyDrive/Programming/PythonRocks/dataset/json/indentli_bolgesatis_table.json", mode='r') as f:
    content=f.read()
print(content)

{
    "schema":{
        "fields":[
            {
                "name":"Bolge",
                "type":"string"
            },
            {
                "name":"Yil",
                "type":"integer"
            },
            {
                "name":"Satis",
                "type":"integer"
            }
        ],
        "pandas_version":"0.20.0"
    },
    "data":[
        {
            "Bolge":"Akdeniz",
            "Yil":2020,
            "Satis":10
        },
        {
            "Bolge":"Marmara",
            "Yil":2020,
            "Satis":15
        },
        {
            "Bolge":"Akdeniz",
            "Yil":2021,
            "Satis":42
        },
        {
            "Bolge":"Marmara",
            "Yil":2021,
            "Satis":56
        }
    ]
}

In [235]:

c=json.loads(content)
type(c)
c

Out[235]:

dict

Out[235]:

{'schema': {'fields': [{'name': 'Bolge', 'type': 'string'},
   {'name': 'Yil', 'type': 'integer'},
   {'name': 'Satis', 'type': 'integer'}],
  'pandas_version': '0.20.0'},
 'data': [{'Bolge': 'Akdeniz', 'Yil': 2020, 'Satis': 10},
  {'Bolge': 'Marmara', 'Yil': 2020, 'Satis': 15},
  {'Bolge': 'Akdeniz', 'Yil': 2021, 'Satis': 42},
  {'Bolge': 'Marmara', 'Yil': 2021, 'Satis': 56}]}

In [236]:

#accessing a single nested value
c["data"][0]["Bolge"] #dict of list of dict

Out[236]:

'Akdeniz'

In [237]:

#getting all the regions
for l in c["data"]:
    print(l["Bolge"])

Akdeniz
Marmara
Akdeniz
Marmara
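The "split" layout shown earlier (a `columns` list plus a `data` list of rows) can be turned into the same record-style list of dictionaries with `zip`. A sketch with a shortened inline copy of that data rather than the file itself:

```python
import json

# Shortened inline sample in the "split" layout (two of the four rows)
split_json = """
{"columns": ["Bolge", "Yil", "Satis"],
 "data": [["Akdeniz", 2020, 10], ["Marmara", 2020, 15]]}
"""

parsed = json.loads(split_json)

# Pair each row's values with the column names
records = [dict(zip(parsed["columns"], row)) for row in parsed["data"]]

for rec in records:
    print(rec["Bolge"], rec["Satis"])
```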

From Python to JSON¶

In [238]:

x = {
  "name": "John",
  "age": 30,
  "city": "New York"
}

j = json.dumps(x)
type(x)
type(j)
j

Out[238]:

dict

Out[238]:

str

Out[238]:

'{"name": "John", "age": 30, "city": "New York"}'

In [239]:

#more complex (nested) JSON
x = {
  "name": "John",
  "age": 30,
  "married": True,
  "divorced": False,
  "children": ("Ann","Billy"),
  "pets": None,
  "cars": [
    {"model": "BMW 230", "mpg": 27.5},
    {"model": "Ford Edge", "mpg": 24.1}
  ]
}

print(json.dumps(x))

{"name": "John", "age": 30, "married": true, "divorced": false, "children": ["Ann", "Billy"], "pets": null, "cars": [{"model": "BMW 230", "mpg": 27.5}, {"model": "Ford Edge", "mpg": 24.1}]}

In [240]:

#and this is the pretty-printed version
print(json.dumps(x,indent=4))

{
    "name": "John",
    "age": 30,
    "married": true,
    "divorced": false,
    "children": [
        "Ann",
        "Billy"
    ],
    "pets": null,
    "cars": [
        {
            "model": "BMW 230",
            "mpg": 27.5
        },
        {
            "model": "Ford Edge",
            "mpg": 24.1
        }
    ]
}

In [241]:

print(json.dumps(x,indent=4,sort_keys=True))

{
    "age": 30,
    "cars": [
        {
            "model": "BMW 230",
            "mpg": 27.5
        },
        {
            "model": "Ford Edge",
            "mpg": 24.1
        }
    ],
    "children": [
        "Ann",
        "Billy"
    ],
    "divorced": false,
    "married": true,
    "name": "John",
    "pets": null
}

You can also write these to a file using the I/O operations covered further below.
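As a minimal sketch of writing JSON to a file, `json.dump` takes the object and an open file handle. The file name `person.json` and the temporary directory are assumptions made only to keep the example self-contained:

```python
import json
import os
import tempfile

data = {"name": "John", "children": ("Ann", "Billy"), "pets": None}

# Hypothetical target path inside a throwaway directory
path = os.path.join(tempfile.mkdtemp(), "person.json")

with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)

with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored)
```

The round trip is not perfectly symmetric: the tuple comes back as a list, because JSON only has arrays.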

Request¶

This section draws on https://realpython.com/python-requests/

Basics¶

In [242]:

import requests

In [243]:

httpget0=r"https://www.excelinefendisi.com/httpapiservice/ResponseveRequestTarget.aspx" #accepts query parameters, e.g. Anakonu=VBAMakro, Altkonu=Temeller
httpget0j=r"https://www.excelinefendisi.com/httpapiservice/ReturnJson.aspx" #all announcements as JSON
httpget3=r"https://httpbin.org/get" #as json
httpget4img=r"https://httpbin.org/image"
httpget5=r"https://www.google.com/search?q=excel&oq=excel&aqs=chrome..69i57j35i39l2j0i433l4j46i433l2j0.839j0j15&sourceid=chrome&ie=UTF-8"
httpget6githubjson=r"https://raw.githubusercontent.com/VolkiTheDreamer/dataset/master/json/bolgesatis.json"

httppost3=r"https://httpbin.org/post"

In [244]:

r=requests.get(httpget3)

In [245]:

r.headers

Out[245]:

{'Date': 'Sun, 22 Sep 2024 15:26:54 GMT', 'Content-Type': 'application/json', 'Content-Length': '307', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true'}

In [246]:

requests.get('http://volkanyurtseven.com')

Out[246]:

<Response [200]>

In [247]:

try:
  requests.get('https://volkanyurtseven.com') #there is no https version of the site, so this raises an error
  print("site is up")
except requests.exceptions.RequestException: #covers connection and SSL errors
  print("site is down")

site is down

In [248]:

requests.get('http://volkanyurtseven.com/olmayansayfa') #this time the site is right, but the requested page does not exist, so we get 404

Out[248]:

<Response [404]>

In [249]:

response = requests.get('http://volkanyurtseven.com')
response.status_code

Out[249]:

200

In [250]:

response = requests.get(httpget3)
response.content #as bytes

Out[250]:

b'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.32.3", \n    "X-Amzn-Trace-Id": "Root=1-66f03742-6923173002866a3c50f6dfc5"\n  }, \n  "origin": "34.16.168.234", \n  "url": "https://httpbin.org/get"\n}\n'

In [251]:

response.text #as a string

Out[251]:

'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.32.3", \n    "X-Amzn-Trace-Id": "Root=1-66f03742-6923173002866a3c50f6dfc5"\n  }, \n  "origin": "34.16.168.234", \n  "url": "https://httpbin.org/get"\n}\n'

In [252]:

response.raw

Out[252]:

<urllib3.response.HTTPResponse at 0x7d45a637c1f0>

This is hard to read, so let's parse it as JSON. But first, let's send another GET request to fetch the latest version.

In [253]:

#still not very readable
response = requests.get(httpget3)
response.text

Out[253]:

'{\n  "args": {}, \n  "headers": {\n    "Accept": "*/*", \n    "Accept-Encoding": "gzip, deflate", \n    "Host": "httpbin.org", \n    "User-Agent": "python-requests/2.32.3", \n    "X-Amzn-Trace-Id": "Root=1-66f03743-514f57731ffeb7ce34129485"\n  }, \n  "origin": "34.16.168.234", \n  "url": "https://httpbin.org/get"\n}\n'

In [254]:

json.loads(response.text)

Out[254]:

{'args': {},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.32.3',
  'X-Amzn-Trace-Id': 'Root=1-66f03743-514f57731ffeb7ce34129485'},
 'origin': '34.16.168.234',
 'url': 'https://httpbin.org/get'}

In [255]:

#or we can use the response's json() method
response.json()

Out[255]:

{'args': {},
 'headers': {'Accept': '*/*',
  'Accept-Encoding': 'gzip, deflate',
  'Host': 'httpbin.org',
  'User-Agent': 'python-requests/2.32.3',
  'X-Amzn-Trace-Id': 'Root=1-66f03743-514f57731ffeb7ce34129485'},
 'origin': '34.16.168.234',
 'url': 'https://httpbin.org/get'}

It should be noted that the success of the call to r.json() does not indicate the success of the response. Some servers may return a JSON object in a failed response (e.g. error details with HTTP 500). Such JSON will be decoded and returned. To check that a request is successful, use r.raise_for_status() or check r.status_code is what you expect.

In [256]:

response.raise_for_status()
response.status_code

Out[256]:

200

In [257]:

dict(response.headers)

Out[257]:

{'Date': 'Sun, 22 Sep 2024 15:26:59 GMT',
 'Content-Type': 'application/json',
 'Content-Length': '307',
 'Connection': 'keep-alive',
 'Server': 'gunicorn/19.9.0',
 'Access-Control-Allow-Origin': '*',
 'Access-Control-Allow-Credentials': 'true'}

In [258]:

response.headers['Content-Type']

Out[258]:

'application/json'

QueryString parameters¶

In [259]:

# Search GitHub's repositories for requests
response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
)

# Inspect some attributes of the `requests` repository
json_response = response.json()
repository = json_response['items'][0]
print(f'Repository name: {repository["name"]}')  # Python 3.6+
print(f'Repository description: {repository["description"]}')  # Python 3.6+
print(response.url)

Repository name: secrules-language-evaluation
Repository description: Set of Python scripts to perform SecRules language evaluation on a given http request.
https://api.github.com/search/repositories?q=requests%2Blanguage%3Apython
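The printed URL shows that the literal `+` in the query was percent-encoded as `%2B`. That is probably why the top hit is not the `requests` repository: the server likely received a plus sign rather than a word separator. The standard library's `urlencode` produces the same encoding as the URL above, so the sketch below uses it to show the difference without making a network call:

```python
from urllib.parse import urlencode

# A literal "+" is escaped, so it no longer acts as a separator
with_plus = urlencode({"q": "requests+language:python"})
# A space is what gets encoded as "+"
with_space = urlencode({"q": "requests language:python"})

print(with_plus)   # q=requests%2Blanguage%3Apython
print(with_space)  # q=requests+language%3Apython
```

Under that assumption, passing `params={'q': 'requests language:python'}` is the form that sends the two search terms separately.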

In [260]:

response = requests.get(
    'https://www.excelinefendisi.com/httpapiservice/ResponseveRequestTarget.aspx',
    params={'Anakonu':'VBAMakro', 'Altkonu':'Temeller'},
)

print(response.url)

https://www.excelinefendisi.com/httpapiservice/ResponseveRequestTarget.aspx?Anakonu=VBAMakro&Altkonu=Temeller

In [261]:

response.text

Out[261]:

'\r\n\r\n<!DOCTYPE html>\r\n\r\n<html xmlns="http://www.w3.org/1999/xhtml">\r\n<head><title>\r\n\r\n</title></head>\r\n<body>\r\n    <form method="post" action="./ResponseveRequestTarget.aspx?Anakonu=VBAMakro&amp;Altkonu=Temeller" id="form1">\r\n<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTc4MTU0MzE5Ng9kFgICAw9kFgQCAQ8PFgIeBFRleHQFSFZCQU1ha3JvIGFuYWtvdXN1IHZlIFRlbWVsbGVyICBhbHRrb251c3UgYWx0xLFuZGEgdG9wbGFtIDUgYWRldCBrb251IHZhcmRkAgMPPCsAEQEMFCsAAGQYAQUJR3JpZFZpZXcxD2dkSbXqjXArXyEF5jtUymfpoO6KaTfD1XAK8zOQxOrGCZk=" />\r\n\r\n<input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="BD530E7E" />\r\n        <div>\r\n            <h2>Genel bilgier</h2>\r\n            <span id="lblSonuc">VBAMakro anakousu ve Temeller  altkonusu altında toplam 5 adet konu var</span>  <br /><br />\r\n        </div>\r\n        <div>\r\n            <h2>Data bölgesi</h2>\r\n            <div>\r\n\r\n</div>\r\n        </div>\r\n    </form>\r\n</body>\r\n</html>\r\n'

In [262]:

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get('https://httpbin.org/get', params=payload)

Headers¶

In [263]:

r.headers

Out[263]:

{'Date': 'Sun, 22 Sep 2024 15:27:02 GMT', 'Content-Type': 'application/json', 'Content-Length': '378', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true'}

In [264]:

#customize headers
response = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
    headers={'Accept': 'application/vnd.github.v3.text-match+json'},
)

# View the new `text-matches` array which provides information
# about your search term within the results
json_response = response.json()
repository = json_response['items'][0]
print(f'Text matches: {repository["text_matches"]}')

Text matches: [{'object_url': 'https://api.github.com/repositories/33210074', 'object_type': 'Repository', 'property': 'description', 'fragment': 'Set of Python scripts to perform SecRules language evaluation on a given http request.', 'matches': [{'text': 'Python', 'indices': [7, 13]}, {'text': 'language', 'indices': [42, 50]}, {'text': 'request', 'indices': [78, 85]}]}]

In [265]:

requests.post('https://httpbin.org/post', data={'key':'value'})
requests.put('https://httpbin.org/put', data={'key':'value'})
requests.delete('https://httpbin.org/delete')
requests.head('https://httpbin.org/get')
requests.patch('https://httpbin.org/patch', data={'key':'value'})
requests.options('https://httpbin.org/get')

Out[265]:

<Response [200]>

Out[265]:

<Response [200]>

Out[265]:

<Response [200]>

Out[265]:

<Response [200]>

Out[265]:

<Response [200]>

Out[265]:

<Response [200]>

In [266]:

response = requests.head('https://httpbin.org/get')
response.headers['Content-Type']

Out[266]:

'application/json'

In [267]:

response = requests.delete('https://httpbin.org/delete')
json_response = response.json()
json_response['args']

Out[267]:

{}

According to the HTTP specification, POST, PUT, and the less common PATCH requests pass their data through the message body rather than through parameters in the query string.

In [268]:

requests.post('https://httpbin.org/post', data={'key':'value'}) #or data=[('key', 'value')]

Out[268]:

<Response [200]>

In [269]:

response = requests.post('https://httpbin.org/post', json={'key':'value'})
json_response = response.json()
json_response['data']
json_response['headers']['Content-Type']

Out[269]:

'{"key": "value"}'

Out[269]:

'application/json'

In [270]:

response = requests.post('https://httpbin.org/post', json={'key':'value'})
response.request.headers['Content-Type']
response.request.url
response.request.body

Out[270]:

'application/json'

Out[270]:

'https://httpbin.org/post'

Out[270]:

b'{"key": "value"}'

In [271]:

response.url
response.headers
response.request.url
response.request.headers

Out[271]:

'https://httpbin.org/post'

Out[271]:

{'Date': 'Sun, 22 Sep 2024 15:27:06 GMT', 'Content-Type': 'application/json', 'Content-Length': '481', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true'}

Out[271]:

'https://httpbin.org/post'

Out[271]:

{'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '16', 'Content-Type': 'application/json'}

In [272]:

#authentication
from getpass import getpass
requests.get('https://api.github.com/user', auth=('username', getpass()))

··········

Out[272]:

<Response [401]>

Web service¶

BeautifulSoup¶

https://realpython.com/beautiful-soup-web-scraper-python/

In [273]:

URL = 'https://www.monster.com/jobs/search/?q=Software-Developer&where=Australia'
page = requests.get(URL)

In [274]:

import requests
from bs4 import BeautifulSoup

URL = 'https://www.excelinefendisi.com/Konular/Excel/Giris_PratikKisayollar.aspx'
page = requests.get(URL)

soup = BeautifulSoup(page.content, 'html.parser')

In [275]:

results = soup.find_all(class_='alterantelitable')
for r in results:
    print(r, end='\n'*2)

<table class="alterantelitable">
<th>Amaç</th>
<th>Kısayol</th>
<tr>
<td>Sayfalar arasında dolaşmak</td>
<td>CTRL + PgUp/PgDn</td>
</tr>
<tr>
<td>Bugünün Tarihini yazmak</td>
<td>CTRL + SHIFT +, </td>
</tr>
<tr>
<td>Tüm açık dosyalarda calculation yapmak</td>
<td>F9</td>
</tr>
<tr>
<td>Seçili kısmın değerini hesaplayıp göstermek</td>
<td>Hücre içindeki formül seçilip F9</td>
</tr>
<tr>
<td>Aktif sayfada calculation yapmak</td>
<td>SHIFT+F9</td>
</tr>
<tr>
<td>Sadece belli range için calculation yapmak</td>
<td>VBA ile yapılır. <a href="/Konular/VBAMakro/DortTemelNesne_Range.aspx#Calculation">Burdan </a>bakın.</td>
</tr>
<tr>
<td>Bulunduğun hücrenin <abbr title="Bulunulan hücrenin etrafındaki tüm dolu alandır">CurrentRegion</abbr>'ını seçme</td>
<td>CTRL+ A</td>
</tr>
<tr>
<td>Bulunduğun hücreden <abbr title="Bulunulan hücrenin etrafındaki tüm dolu alandır">CurrentRegion</abbr>'ın uç noktlarına gitmek</td>
<td>CTRL+ Ok tuşları</td>
</tr>
<tr>
<td>Bulunduğun hücreden itibaren belli bir yöne doğru seçim yapmak</td>
<td>SHIFT+Ok tuşları</td>
</tr>
<tr><td>Bulunduğun hücreden itibaren <abbr title="Bulunulan hücrenin etrafındaki tüm dolu alandır">CurrentRegion</abbr> bir ucuna doğru toplu seçim yapmak</td>
<td>CTRL+SHIFT+Ok tuşları</td>
</tr>
<tr>
<td>Bulunduğun hücreden <abbr title="Bulunulan hücrenin etrafındaki tüm dolu alandır">CurrentRegion</abbr>'ın Sağ Aşağı uç noktlasına gitmek</td>
<td>CTRL+END</td>
</tr>
<tr>
<td>Bulunduğun hücreden <abbr title="Bulunulan hücrenin etrafındaki tüm dolu alandır">CurrentRegion</abbr>'ın Sağ Aşağı uç noktlasına kadar seçmek</td>
<td>CTRL+SHIFT+END</td>
</tr>
<tr>
<td>Bulunduğun hücreden  A1 hücresine kadar olan alanı(sol yukarı) seçmek</td>
<td>CTRL+SHIFT+HOME</td>
</tr>
<tr>
<td>Bir hücre içinde veri girerken, aynı hücre içinde yeni bir satır açıp oradan devam etmek</td>
<td>ALT+ENTER</td>
</tr>
<tr>
<td>Veri/Formül girişi yaptığınız hücrede alt hücreye geçmeden giriş tamamlamak </td>
<td>CTRL+ENTER</td>
</tr>
<tr>
<td> Ekranda bir sayfa sağa kaymak.</td>
<td>ALT+PGE DOWN</td>
</tr>
<tr>
<td>AutoFilter'ı aktif/pasif hale getirmek</td>
<td>CTRL+SHIFT+L</td>
</tr>
<tr>
<td>Bulunduğunuz hücrenin satır ve sütununa aynı anda freeze uygulamak/kaldırmak</td>
<td>Alt+W+FF</td>
</tr>
<tr>
<td>VBA editörünü açmak</td>
<td>Alt+F11</td>
</tr>
<tr>
<td>Ribbonu küçültüp/büyütmek</td>
<td>CTRL+F1</td>
</tr>
<tr>
<td>Üst hücrelerdeki tüm rakamların toplamını almak</td>
<td>ALT+=</td>
</tr>
<tr>
<td>Flash Fill uygulamak</td>
<td>CTRL+E</td>
</tr>
<tr>
<td>Sadece görünen hücreleri seçmek</td>
<td>ALT+;</td>
</tr>
</table>

Let's get rid of the unnecessary elements and tags.

In [276]:

results = soup.find_all(class_='alterantelitable')
for r in results:
    tds=r.find_all("td")
    for td in tds:
        print(td.text)

Sayfalar arasında dolaşmak
CTRL + PgUp/PgDn
Bugünün Tarihini yazmak
CTRL + SHIFT +, 
Tüm açık dosyalarda calculation yapmak
F9
Seçili kısmın değerini hesaplayıp göstermek
Hücre içindeki formül seçilip F9
Aktif sayfada calculation yapmak
SHIFT+F9
Sadece belli range için calculation yapmak
VBA ile yapılır. Burdan bakın.
Bulunduğun hücrenin CurrentRegion'ını seçme
CTRL+ A
Bulunduğun hücreden CurrentRegion'ın uç noktlarına gitmek
CTRL+ Ok tuşları
Bulunduğun hücreden itibaren belli bir yöne doğru seçim yapmak
SHIFT+Ok tuşları
Bulunduğun hücreden itibaren CurrentRegion bir ucuna doğru toplu seçim yapmak
CTRL+SHIFT+Ok tuşları
Bulunduğun hücreden CurrentRegion'ın Sağ Aşağı uç noktlasına gitmek
CTRL+END
Bulunduğun hücreden CurrentRegion'ın Sağ Aşağı uç noktlasına kadar seçmek
CTRL+SHIFT+END
Bulunduğun hücreden  A1 hücresine kadar olan alanı(sol yukarı) seçmek
CTRL+SHIFT+HOME
Bir hücre içinde veri girerken, aynı hücre içinde yeni bir satır açıp oradan devam etmek
ALT+ENTER
Veri/Formül girişi yaptığınız hücrede alt hücreye geçmeden giriş tamamlamak 
CTRL+ENTER
 Ekranda bir sayfa sağa kaymak.
ALT+PGE DOWN
AutoFilter'ı aktif/pasif hale getirmek
CTRL+SHIFT+L
Bulunduğunuz hücrenin satır ve sütununa aynı anda freeze uygulamak/kaldırmak
Alt+W+FF
VBA editörünü açmak
Alt+F11
Ribbonu küçültüp/büyütmek
CTRL+F1
Üst hücrelerdeki tüm rakamların toplamını almak
ALT+=
Flash Fill uygulamak
CTRL+E
Sadece görünen hücreleri seçmek
ALT+;

Now let's print each action and its shortcut side by side rather than on separate lines.

In [277]:

results = soup.find_all(class_='alterantelitable')
for r in results:
    trs=r.find_all("tr")
    for tr in trs:
        td1=tr.select("td")[0] #tr.find("td") would also work, but we used select for both for consistency with the line below
        td2=tr.select("td")[1]
        print(td1.text,":",td2.text)

Sayfalar arasında dolaşmak : CTRL + PgUp/PgDn
Bugünün Tarihini yazmak : CTRL + SHIFT +, 
Tüm açık dosyalarda calculation yapmak : F9
Seçili kısmın değerini hesaplayıp göstermek : Hücre içindeki formül seçilip F9
Aktif sayfada calculation yapmak : SHIFT+F9
Sadece belli range için calculation yapmak : VBA ile yapılır. Burdan bakın.
Bulunduğun hücrenin CurrentRegion'ını seçme : CTRL+ A
Bulunduğun hücreden CurrentRegion'ın uç noktlarına gitmek : CTRL+ Ok tuşları
Bulunduğun hücreden itibaren belli bir yöne doğru seçim yapmak : SHIFT+Ok tuşları
Bulunduğun hücreden itibaren CurrentRegion bir ucuna doğru toplu seçim yapmak : CTRL+SHIFT+Ok tuşları
Bulunduğun hücreden CurrentRegion'ın Sağ Aşağı uç noktlasına gitmek : CTRL+END
Bulunduğun hücreden CurrentRegion'ın Sağ Aşağı uç noktlasına kadar seçmek : CTRL+SHIFT+END
Bulunduğun hücreden  A1 hücresine kadar olan alanı(sol yukarı) seçmek : CTRL+SHIFT+HOME
Bir hücre içinde veri girerken, aynı hücre içinde yeni bir satır açıp oradan devam etmek : ALT+ENTER
Veri/Formül girişi yaptığınız hücrede alt hücreye geçmeden giriş tamamlamak  : CTRL+ENTER
 Ekranda bir sayfa sağa kaymak. : ALT+PGE DOWN
AutoFilter'ı aktif/pasif hale getirmek : CTRL+SHIFT+L
Bulunduğunuz hücrenin satır ve sütununa aynı anda freeze uygulamak/kaldırmak : Alt+W+FF
VBA editörünü açmak : Alt+F11
Ribbonu küçültüp/büyütmek : CTRL+F1
Üst hücrelerdeki tüm rakamların toplamını almak : ALT+=
Flash Fill uygulamak : CTRL+E
Sadece görünen hücreleri seçmek : ALT+;

There is also MechanicalSoup, a related library that can additionally fill in forms on websites automatically.

Logging¶

When testing your program, it is recommended to use this rather than print.

In [278]:

import logging

logging.debug('This is a debug message')
logging.info('This is an info message')
#by default, the logging module logs the messages with a severity level of WARNING or above
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical message')

WARNING:root:This is a warning message
ERROR:root:This is an error message
CRITICAL:root:This is a critical message

root = the default logger
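Besides the root logger, `logging.getLogger(name)` returns a named logger whose name then appears in place of `root`. A minimal sketch; the logger name `my_app` and the in-memory stream are assumptions chosen so the output can be inspected:

```python
import io
import logging

stream = io.StringIO()  # collect the log text in memory instead of stderr
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))

logger = logging.getLogger("my_app")  # hypothetical logger name
logger.setLevel(logging.DEBUG)        # let DEBUG and above through
logger.addHandler(handler)
logger.propagate = False              # do not also hand records to the root logger

logger.debug("debug from a named logger")
logger.warning("warning from a named logger")

log_text = stream.getvalue()
print(log_text, end="")
```

In a module, `logging.getLogger(__name__)` is the usual way to pick the name.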

In [279]:

logging.basicConfig(level=logging.DEBUG,force=True) #force=True is needed when the root logger already has handlers (as in notebooks); otherwise basicConfig does nothing. Details: https://stackoverflow.com/questions/30861524/logging-basicconfig-not-creating-log-file-when-i-run-in-pycharm/42210221
logging.debug('This will get logged')

DEBUG:root:This will get logged

In [280]:

logging.basicConfig(filename='app.log', filemode='w', format='%(name)s - %(levelname)s - %(message)s',force=True)
logging.warning('This will get logged to a file')
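The cell above prints nothing because the message goes to app.log. One way to confirm what was written is to read the file back. The sketch below does this with a `logging.FileHandler` on a separate logger; the logger name `file_demo` and the temporary path are assumptions, used so the working directory is not touched.

```python
import logging
import os
import tempfile

# Temporary location (assumption) so no app.log is created in the working directory
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

file_logger = logging.getLogger("file_demo")  # hypothetical logger name
file_logger.setLevel(logging.WARNING)
file_logger.propagate = False

file_handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
file_handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))
file_logger.addHandler(file_handler)

file_logger.warning("This will get logged to a file")
file_handler.close()  # flush and release the file

# Read the log file back to see what was recorded
with open(log_path, encoding="utf-8") as f:
    log_text = f.read()
print(log_text, end="")
```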

In [281]:

logging.basicConfig(format='%(name)s - %(levelname)s - %(asctime)s - %(message)s', level=logging.INFO,force=True)
logging.info('Admin logged in')

root - INFO - 2024-09-22 15:27:21,768 - Admin logged in

In [282]:

name = 'John'
logging.error(f'{name} raised an error')

root - ERROR - 2024-09-22 15:27:21,801 - John raised an error

In [283]:

a = 5
b = 0

try:
    c = a / b
except Exception as e:
    logging.error("Exception occurred", exc_info=True)

root - ERROR - 2024-09-22 15:27:21,848 - Exception occurred
Traceback (most recent call last):
  File "<ipython-input-283-ed4209066cce>", line 5, in <cell line: 4>
    c = a / b
ZeroDivisionError: division by zero
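Inside an except block, `logging.exception` is a shorthand for logging at ERROR level with `exc_info=True`. A minimal sketch follows; the logger name `exc_demo` and the in-memory stream are assumptions made so the result can be inspected.

```python
import io
import logging

stream = io.StringIO()
exc_logger = logging.getLogger("exc_demo")  # hypothetical logger name
exc_logger.setLevel(logging.ERROR)
exc_logger.addHandler(logging.StreamHandler(stream))
exc_logger.propagate = False

try:
    c = 5 / 0
except ZeroDivisionError:
    # Equivalent to exc_logger.error("Exception occurred", exc_info=True)
    exc_logger.exception("Exception occurred")

logged = stream.getvalue()
print(logged)
```

The message is followed by the full traceback, just as in the `exc_info=True` cell above.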

tqdm¶

Provides a progress bar for loops.

In [284]:

from tqdm import tqdm, trange
from time import sleep

for i in tqdm(range(10)):
    sleep(.1)

100%|██████████| 10/10 [00:01<00:00,  9.61it/s]

In [285]:

# Simple loop
for i in range(100):
    pass

# Loop with a progress bar
for i in trange(100):
    sleep(0.01)

100%|██████████| 100/100 [00:01<00:00, 96.83it/s]

In [286]:

# the notebook versions look livelier
from tqdm.notebook import trange, tqdm
from time import sleep

for i in trange(3, desc='1st loop'):
    for j in tqdm(range(100), desc='2nd loop'):
        sleep(0.01)

1st loop:   0%|          | 0/3 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

If the progress bar does not show up, the cause is probably a widget in nbextensions; see: https://stackoverflow.com/questions/57343134/jupyter-notebooks-not-displaying-progress-bars

In [287]:

# !jupyter nbextension enable --py widgetsnbextension

In [288]:

from tqdm import tqdm_notebook
from tqdm.notebook import trange
from time import sleep

for i in trange(4, desc='1st loop'):
    for j in trange(100, desc='2nd loop'):
        sleep(0.01)

1st loop:   0%|          | 0/4 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

2nd loop:   0%|          | 0/100 [00:00<?, ?it/s]

In [289]:

from tqdm.notebook import tqdm_notebook
import time

for i in tqdm_notebook(range(10)):
    time.sleep(0.5)

  0%|          | 0/10 [00:00<?, ?it/s]

tqdm in loops¶

In [290]:

from tqdm import tqdm
for i in tqdm(range(2), desc = 'Loop 1'):
    for j in tqdm(range(20,25), desc = 'Loop 2'):
        time.sleep(0.5)

Loop 1:   0%|          | 0/2 [00:00<?, ?it/s]
Loop 2:   0%|          | 0/5 [00:00<?, ?it/s]
Loop 2:  20%|██        | 1/5 [00:00<00:02,  2.00it/s]
Loop 2:  40%|████      | 2/5 [00:01<00:01,  1.99it/s]
Loop 2:  60%|██████    | 3/5 [00:01<00:01,  1.98it/s]
Loop 2:  80%|████████  | 4/5 [00:02<00:00,  1.98it/s]
Loop 2: 100%|██████████| 5/5 [00:02<00:00,  1.98it/s]
Loop 1:  50%|█████     | 1/2 [00:02<00:02,  2.53s/it]
Loop 2:   0%|          | 0/5 [00:00<?, ?it/s]
Loop 2:  20%|██        | 1/5 [00:00<00:02,  2.00it/s]
Loop 2:  40%|████      | 2/5 [00:01<00:01,  1.99it/s]
Loop 2:  60%|██████    | 3/5 [00:01<00:01,  1.99it/s]
Loop 2:  80%|████████  | 4/5 [00:02<00:00,  1.99it/s]
Loop 2: 100%|██████████| 5/5 [00:02<00:00,  1.98it/s]
Loop 1: 100%|██████████| 2/2 [00:05<00:00,  2.54s/it]

In [291]:

from tqdm.notebook import tqdm_notebook
for i in tqdm_notebook(range(2), desc = 'Loop 1'):
    for j in tqdm_notebook(range(20,25), desc = 'Loop 2'):
        time.sleep(0.5)

Loop 1:   0%|          | 0/2 [00:00<?, ?it/s]

Loop 2:   0%|          | 0/5 [00:00<?, ?it/s]

Loop 2:   0%|          | 0/5 [00:00<?, ?it/s]

I/O (file reading and writing) operations¶

Reading¶

file = open(file_name, mode)
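The mode argument decides what the file object may do. The sketch below walks through the most common text modes on a throwaway file; the file name `modes_demo.txt` and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file name

with open(path, "w", encoding="utf-8") as f:   # "w": create or truncate, then write
    f.write("first\n")

with open(path, "a", encoding="utf-8") as f:   # "a": append to the end
    f.write("second\n")

with open(path, "r", encoding="utf-8") as f:   # "r": read only (the default mode)
    text = f.read()

try:
    open(path, "x")                            # "x": create, fail if the file exists
    created = True
except FileExistsError:
    created = False

print(repr(text), created)
```

Adding `+` to a mode (for example `r+`) allows both reading and writing, and adding `b` switches from text to bytes.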

In [292]:

os.chdir("/content/drive/MyDrive/Programming/PythonRocks/mypyext")

In [293]:

import io
dp = io.open("pythonutility.py", "r")

In [294]:

# read
# import io  # this line is normally unnecessary; open and io.open are the same function
dosya = io.open("pythonutility.py", "r") # normally we would not write the io prefix here either; it is added to tell it apart from os.open
print(dosya.readline(1))
print("----")
print(dosya.read()) # only the first character was read above, so reading continues from the second character
print("----")
dosya.seek(0) # move back to the start
print(dosya.readline(2)) # the first 2 characters from the start

f
----
rom __future__ import print_function
import inspect
import os, sys, site
import functools
import time
from forbiddenfruit import curse

try:
    import __builtin__
except ImportError:
    import builtins as __builtin__

# *************************************************************************************************************
#Module level methods

    
def lineno():
    previous_frame = inspect.currentframe().f_back.f_back
    (filename, line_number, function_name, lines, index) = inspect.getframeinfo(previous_frame)
    return (line_number, lines)
    #return inspect.currentframe().f_back.f_back.f_lineno, str(inspect.currentframe().f_back)

def printy(*args, **kwargs):
    print(lineno(),"\n----------")
    print(*args, **kwargs)
    print(" ",end="\n")


def timeElapse(func):
    """
        usage:
        @timeElapse
        def somefunc():
            ...
            ...

        somefunc()
    """
    @functools.wraps(func)
    def wrapper(*args,**kwargs):
        start=time.time()
        value=func(*args,**kwargs)
        func()
        finito=time.time()
        print("Time elapsed:{}".format(finito-start))
        return value
    return wrapper    


def multioutput(type="all"):
    from IPython.core.interactiveshell import InteractiveShell
    InteractiveShell.ast_node_interactivity = type
    
def scriptforReload():
    print("""
    %load_ext autoreload
    %autoreload 2""")
   
def scriptforTraintest():
    print("X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.25,random_state=42)")
    
def scriptForCitation():
    print("""<p style="font-size:smaller;text-align:center">Görsel <a href="url">bu sayfadan</a> alınmıştır</p>""")
    
def pythonSomeInfo():
    print("system packages folder:",sys.prefix, end="\n\n")
    print("pip install folder:",site.getsitepackages(), end="\n\n")    
    print("python version:", sys.version, end="\n\n")
    print("executables location:",sys.executable, end="\n\n")
    print("pip version:", os.popen('pip version').read(), end="\n\n")
    pathes= sys.path
    print("Python pathes")
    for p in pathes:
        print(p)


def showMemoryUsage():
    dict_={}
    global_vars = list(globals().items())
    for var, obj in global_vars:
        if not var.startswith('_'):
            dict_[var]=sys.getsizeof(obj)
            
    final={k: v for k, v in sorted(dict_.items(), key=lambda item: item[1],reverse=True)}    
    print(final)
    
def readfile(path,enc='cp1254'):
    with io.open(path, "r", encoding=enc) as f:
        return f.read()

def getFirstItemFromDictionary(dict_):
    return next(iter(dict_)),next(iter(dict_.values()))
        


def removeItemsFromList(self,list2,inplace=True):    
    """
        Extension method for list type. Removes items from list2 from list1.
        First, forbiddenfruit must be installed via https://pypi.org/project/forbiddenfruit/
    """    
    if inplace:
        for x in set(list2):
            self.remove(x)
        return self
    else:
        temp=self.copy()
        for x in set(list2):
            temp.remove(x)
        return temp
    
curse(list, "removeItemsFromList", removeItemsFromList)






----

Out[294]:

fr
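The output above follows from how the read position moves: `readline(n)` returns at most n characters of the current line, later reads continue from there, and `seek(0)` rewinds. A small sketch, assuming an in-memory `io.StringIO` as a stand-in for the opened file:

```python
import io

# StringIO stands in for a text file opened with open(..., "r")
f = io.StringIO("from os import path\nimport sys\n")

first_char = f.readline(1)   # at most 1 character of the current line
rest_of_line = f.readline()  # remainder of the same line, including "\n"
position = f.tell()          # current offset in the stream
second_line = f.readline()

f.seek(0)                    # rewind to the beginning
first_two = f.readline(2)

print(repr(first_char), repr(rest_of_line), position, repr(second_line), repr(first_two))
```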

In [295]:

# prefix each line with its line number
dosya.seek(0)
i=1
for satır in dosya.readlines():
    print("{}-{}".format(i,satır),end="")
    i+=1

Out[295]:

1-from __future__ import print_function
2-import inspect
3-import os, sys, site
4-import functools
5-import time
6-from forbiddenfruit import curse
7-
8-try:
9-    import __builtin__
10-except ImportError:
11-    import builtins as __builtin__
12-
13-# *************************************************************************************************************
14-#Module level methods
15-
16-    
17-def lineno():
18-    previous_frame = inspect.currentframe().f_back.f_back
19-    (filename, line_number, function_name, lines, index) = inspect.getframeinfo(previous_frame)
20-    return (line_number, lines)
21-    #return inspect.currentframe().f_back.f_back.f_lineno, str(inspect.currentframe().f_back)
22-
23-def printy(*args, **kwargs):
24-    print(lineno(),"\n----------")
25-    print(*args, **kwargs)
26-    print(" ",end="\n")
27-
28-
29-def timeElapse(func):
30-    """
31-        usage:
32-        @timeElapse
33-        def somefunc():
34-            ...
35-            ...
36-
37-        somefunc()
38-    """
39-    @functools.wraps(func)
40-    def wrapper(*args,**kwargs):
41-        start=time.time()
42-        value=func(*args,**kwargs)
43-        func()
44-        finito=time.time()
45-        print("Time elapsed:{}".format(finito-start))
46-        return value
47-    return wrapper    
48-
49-
50-def multioutput(type="all"):
51-    from IPython.core.interactiveshell import InteractiveShell
52-    InteractiveShell.ast_node_interactivity = type
53-    
54-def scriptforReload():
55-    print("""
56-    %load_ext autoreload
57-    %autoreload 2""")
58-   
59-def scriptforTraintest():
60-    print("X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.25,random_state=42)")
61-    
62-def scriptForCitation():
63-    print("""<p style="font-size:smaller;text-align:center">Görsel <a href="url">bu sayfadan</a> alınmıştır</p>""")
64-    
65-def pythonSomeInfo():
66-    print("system packages folder:",sys.prefix, end="\n\n")
67-    print("pip install folder:",site.getsitepackages(), end="\n\n")    
68-    print("python version:", sys.version, end="\n\n")
69-    print("executables location:",sys.executable, end="\n\n")
70-    print("pip version:", os.popen('pip version').read(), end="\n\n")
71-    pathes= sys.path
72-    print("Python pathes")
73-    for p in pathes:
74-        print(p)
75-
76-
77-def showMemoryUsage():
78-    dict_={}
79-    global_vars = list(globals().items())
80-    for var, obj in global_vars:
81-        if not var.startswith('_'):
82-            dict_[var]=sys.getsizeof(obj)
83-            
84-    final={k: v for k, v in sorted(dict_.items(), key=lambda item: item[1],reverse=True)}    
85-    print(final)
86-    
87-def readfile(path,enc='cp1254'):
88-    with io.open(path, "r", encoding=enc) as f:
89-        return f.read()
90-
91-def getFirstItemFromDictionary(dict_):
92-    return next(iter(dict_)),next(iter(dict_.values()))
93-        
94-
95-
96-def removeItemsFromList(self,list2,inplace=True):    
97-    """
98-        Extension method for list type. Removes items from list2 from list1.
99-        First, forbiddenfruit must be installed via https://pypi.org/project/forbiddenfruit/
100-    """    
101-    if inplace:
102-        for x in set(list2):
103-            self.remove(x)
104-        return self
105-    else:
106-        temp=self.copy()
107-        for x in set(list2):
108-            temp.remove(x)
109-        return temp
110-    
111-curse(list, "removeItemsFromList", removeItemsFromList)
112-
113-
114-
115-
116-
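The same numbering can be written without a manual counter. A file object can be iterated line by line, and `enumerate` supplies the index. This is a sketch with an in-memory `io.StringIO` standing in for the opened file (an assumption, so the example runs anywhere).

```python
import io

# StringIO stands in for an open text file
source = io.StringIO("import os\nimport sys\n\nprint('hi')\n")

numbered = []
# Iterating the file object yields one line at a time, so readlines() is not needed;
# enumerate(..., start=1) replaces the manual counter
for line_no, line in enumerate(source, start=1):
    numbered.append("{}-{}".format(line_no, line))

print("".join(numbered), end="")
```

Iterating directly also avoids loading the whole file into a list first, which matters for large files.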

In [296]:

# create
yenidosya=io.open("test.txt","w")
yenidosya.close()

In [297]:

# write to an existing file, appending to the end
yenidosya=io.open("test.txt","a")
yenidosya.write("\nselam")
yenidosya.flush() # write immediately; without this we would not see our changes right away

Out[297]:

Safe file operations

It is important to close files once you are done with them. To make sure a file gets closed, do the work inside a with block.

In [298]:

with io.open("test.txt", "r") as dosya:
    print(dosya.read())

selam
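The with block closes the file even when an exception is raised inside it. The sketch below checks this through the file object's `closed` attribute; the file name `with_demo.txt` and the simulated error are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    f.write("selam\n")

handle = None
try:
    with open(path, "r", encoding="utf-8") as handle:
        handle.read()
        raise ValueError("simulated failure while the file is open")
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception
print(handle.closed)
```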

In [299]:

# open in read+write mode and insert text at the beginning
with io.open("test.txt", "r+") as f:
    content = f.read()
    f.seek(0) # rewind the file to the start
    f.write("volkan\n"+content)

Out[299]:
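One caveat with `r+`: writing overwrites in place and does not shorten the file. Prepending works in the cell above only because the whole content is written back after the new text. The sketch below shows the difference; the file name `rplus_demo.txt` is an assumption.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rplus_demo.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    f.write("selam")

# Prepending works because the whole content is rewritten after the new text
with open(path, "r+", encoding="utf-8") as f:
    content = f.read()
    f.seek(0)
    f.write("volkan\n" + content)

# Writing something shorter over the start leaves the old tail in place ...
with open(path, "r+", encoding="utf-8") as f:
    f.write("hi")
with open(path, encoding="utf-8") as f:
    overwritten = f.read()

# ... unless the file is truncated at the current position
with open(path, "r+", encoding="utf-8") as f:
    f.write("hi")
    f.truncate()
with open(path, encoding="utf-8") as f:
    truncated = f.read()

print(repr(overwritten), repr(truncated))
```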

In [300]:

!rm test.txt
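`!rm` is a shell escape and assumes a Unix-like shell such as the one in Colab. A portable alternative from Python itself might look like the sketch below; the temporary path is an assumption standing in for test.txt.

```python
import tempfile
from pathlib import Path

target = Path(tempfile.mkdtemp()) / "test.txt"  # temporary stand-in for test.txt
target.write_text("selam", encoding="utf-8")

existed_before = target.exists()
target.unlink()                 # delete the file; raises FileNotFoundError if it is missing
target.unlink(missing_ok=True)  # Python 3.8+: no error if it is already gone

print(existed_before, target.exists())
```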

Turkish characters

UTF-8 or cp1254?

https://python-istihza.yazbel.com/karakter_kodlama.html

In [301]:

import locale
locale.getpreferredencoding()

Out[301]:

'UTF-8'
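Why the choice matters: text saved as UTF-8 but decoded as cp1254 does not necessarily raise an error; the Turkish characters simply come out garbled. A small sketch with a made-up sample string (an assumption, not taken from any file):

```python
text = "RMSE için yapılabilir"

raw = text.encode("utf-8")    # bytes as they would be stored in a UTF-8 file
wrong = raw.decode("cp1254")  # the wrong codec does not fail here, it garbles
right = raw.decode("utf-8")

print(wrong)
print(right)

# The damage is reversible as long as every character maps back to a cp1254 byte
repaired = wrong.encode("cp1254").decode("utf-8")
print(repaired == text)
```

Each non-ASCII letter occupies two bytes in UTF-8, and cp1254 shows each byte as a separate character, which is why `ç` turns into `Ã§` and `ı` into `Ä±`.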

In [302]:

with io.open("/content/drive/MyDrive/Programming/PythonRocks/mypyext/ml.py", "r", encoding='cp1254') as f:
    print(f.read())

import numpy as np
import pandas as pd
from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score, precision_score, f1_score
from sklearn.metrics import roc_curve, precision_recall_curve, auc
from sklearn.metrics import mean_squared_error,mean_absolute_error,r2_score
import matplotlib.cm as cm
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import LabelEncoder, OneHotEncoder,binarize
from sklearn.pipeline import Pipeline 
import os, sys, site
import itertools    
from numpy.random import uniform
from random import sample, seed
from math import isnan
from multiprocessing import Pool
from scipy.spatial import distance
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.model_selection import learning_curve
import networkx as nx
from sklearn.experimental import enable_halving_search_cv 
from sklearn.model_selection import RandomizedSearchCV,GridSearchCV,HalvingGridSearchCV,HalvingRandomSearchCV
import warnings
import statsmodels.api as sm
import statsmodels.stats.api as sms
import statsmodels.formula.api as smf

def adjustedr2(R_sq,y,y_pred,x):
    return 1 - (1-R_sq)*(len(y)-1)/(len(y_pred)-x.shape[1]-1)

def calculate_aic_bic(n, mse, num_params):
    """
        n=number of instances in y        
    """    
    aic = n *np.log(mse) + 2 * num_params
    bic = n * np.log(mse) + num_params * np.log(n)
    # ssr = fitted.ssr #residual sum of squares
    # AIC = N + N*np.log(2.0*np.pi*ssr/N)+2.0*(p+1)
    # print(AIC)
    # BIC = N + N*np.log(2.0*np.pi*ssr/N) + p*np.log(N)
    # print(BIC)
    return aic, bic   

    
def printScores(y_test,y_pred,x=None,*, alg_type='c',f1avg=None):
    """    
    prints the available performanse scores.
    Args:
    alg_type: c for classfication, r for regressin
    f1avg: if None, taken as binary.
    """
    if alg_type=='c':
        acc=accuracy_score(y_test,y_pred)
        print("Accuracy:",acc)
        recall=recall_score(y_test,y_pred)
        print("Recall:",recall)
        precision=precision_score(y_test,y_pred)
        print("Precision:",precision)
        if f1avg is None:
            f1=f1_score(y_test,y_pred)
        else:
            f1=f1_score(y_test,y_pred,average=f1avg)
        print("F1:",f1)
        return acc,recall,precision,f1
    else:
        mse=mean_squared_error(y_test,y_pred) #RMSE iÃ§in squared=False yapÄ±labilir ama bize mse de lazÄ±m
        rmse=round(np.sqrt(mse),2)
        print("RMSE:",rmse)
        mae=round(mean_absolute_error(y_test,y_pred),2)
        print("MAE:",mae)        
        r2=round(r2_score(y_test,y_pred),2)
        print("r2:",r2)
        adjr2=round(adjustedr2(r2_score(y_test,y_pred),y_test,y_pred,x),2)
        print("Adjusted R2:",adjr2)
        aic, bic=calculate_aic_bic(len(y_test),mse,len(x))
        print("AIC:",round(aic,2))
        print("BIC:",round(bic,2))
        return (rmse,mae,r2,adjr2,round(aic,2),round(bic,2))

def draw_sihoutte(range_n_clusters,data,isbasic=True,printScores=True,random_state=42):
    """
    - isbasic:if True, plots scores as line chart whereas false, plots the sihoutte chart.
    - taken from https://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html and modified as needed.
    """
    if isbasic==False:
        silhouette_max=0
        for n_clusters in range_n_clusters:
        # Create a subplot with 1 row and 2 columns
            fig, (ax1, ax2) = plt.subplots(1, 2)
            fig.set_size_inches(12,4)

            ax1.set_xlim([-1, 1])
            # The (n_clusters+1)*10 is for inserting blank space between silhouette
            # plots of individual clusters, to demarcate them clearly.
            ax1.set_ylim([0, len(data) + (n_clusters + 1) * 10])

            # Initialize the clusterer with n_clusters value and a random generator
            # seed of 10 for reproducibility.
            clusterer = KMeans(n_clusters=n_clusters, random_state=random_state)
            cluster_labels = clusterer.fit_predict(data)

            # The silhouette_score gives the average value for all the samples.
            # This gives a perspective into the density and separation of the formed
            # clusters
            silhouette_avg = silhouette_score(data, cluster_labels)
            if silhouette_avg>silhouette_max:
                silhouette_max,nc=silhouette_avg,n_clusters
            print("For n_clusters =", n_clusters,
                "The average silhouette_score is :", silhouette_avg)

            # Compute the silhouette scores for each sample
            sample_silhouette_values = silhouette_samples(data, cluster_labels)

            y_lower = 10
            for i in range(n_clusters):
                # Aggregate the silhouette scores for samples belonging to
                # cluster i, and sort them
                ith_cluster_silhouette_values = \
                    sample_silhouette_values[cluster_labels == i]

                ith_cluster_silhouette_values.sort()

                size_cluster_i = ith_cluster_silhouette_values.shape[0]
                y_upper = y_lower + size_cluster_i

                color = cm.nipy_spectral(float(i) / n_clusters)
                ax1.fill_betweenx(np.arange(y_lower, y_upper),
                                0, ith_cluster_silhouette_values,
                                facecolor=color, edgecolor=color, alpha=0.7)

                # Label the silhouette plots with their cluster numbers at the middle
                ax1.text(-0.05, y_lower + 0.5 * size_cluster_i, str(i))

                # Compute the new y_lower for next plot
                y_lower = y_upper + 10  # 10 for the 0 samples

            ax1.set_title("The silhouette plot for the various clusters.")
            ax1.set_xlabel("The silhouette coefficient values")
            ax1.set_ylabel("Cluster label")

            # The vertical line for average silhouette score of all the values
            ax1.axvline(x=silhouette_avg, color="red", linestyle="--")

            ax1.set_yticks([])  # Clear the yaxis labels / ticks
            ax1.set_xticks([-0.1, 0, 0.2, 0.4, 0.6, 0.8, 1])

            # 2nd Plot showing the actual clusters formed
            colors = cm.nipy_spectral(cluster_labels.astype(float) / n_clusters)
            ax2.scatter(data[:, 0], data[:, 1], marker='.', s=30, lw=0, alpha=0.7,
                        c=colors, edgecolor='k')

            # Labeling the clusters
            centers = clusterer.cluster_centers_
            # Draw white circles at cluster centers
            ax2.scatter(centers[:, 0], centers[:, 1], marker='o',
                        c="white", alpha=1, s=200, edgecolor='k')

            for i, c in enumerate(centers):
                ax2.scatter(c[0], c[1], marker='$%d$' % i, alpha=1,
                            s=50, edgecolor='k')

            ax2.set_title("The visualization of the clustered data.")
            ax2.set_xlabel("Feature space for the 1st feature")
            ax2.set_ylabel("Feature space for the 2nd feature")

            plt.suptitle(("Silhouette analysis for KMeans clustering on sample data "
                        "with n_clusters = %d" % n_clusters),
                        fontsize=14, fontweight='bold')            
            plt.show()
        print(f"Best score is {silhouette_max} for {nc}")
    else:
        ss = []
        for n in range_n_clusters:
            kmeans = KMeans(n_clusters=n, random_state=random_state)
            kmeans.fit_transform(data)
            labels = kmeans.labels_
            score = silhouette_score(data, labels)
            ss.append(score)
            if printScores==True:
                print(n,score)
        plt.plot(range_n_clusters,ss)
        plt.xticks(range_n_clusters) #so it shows all the ticks

def drawEpsilonDecider(data,n):
    """
    for DBSCAN
    n: # of neighbours(in the nearest neighbour calculation, the point itself will appear as the first nearest neighbour. so, this should be
    given as min_samples+1.
    data:numpy array
    """
    neigh = NearestNeighbors(n_neighbors=n)
    nbrs = neigh.fit(data)
    distances, indices = nbrs.kneighbors(data)
    distances = np.sort(distances, axis=0)
    distances = distances[:,1]
    plt.ylabel("eps")
    plt.xlabel("number of data points")
    plt.plot(distances)
    
def draw_elbow(ks,data):
    wcss = []
    for i in ks:
        kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0) 
        kmeans.fit(data)
        wcss.append(kmeans.inertia_)
    plt.plot(ks, wcss)
    plt.title('Elbow Method')
    plt.xlabel('# of clusters')
    plt.ylabel('WCSS')
    plt.xticks(ks)
    plt.show()
    
def biplot(score,coeff,y,variance,labels=None):
    """
    PCA biplot.
    Found at https://stackoverflow.com/questions/39216897/plot-pca-loadings-and-loading-in-biplot-in-sklearn-like-rs-autoplot
    """
    xs = score[:,0]
    ys = score[:,1]
    n = coeff.shape[0]
    scalex = 1.0/(xs.max() - xs.min())
    scaley = 1.0/(ys.max() - ys.min())
    plt.scatter(xs * scalex,ys * scaley, c = y)
    for i in range(n):
        plt.arrow(0, 0, coeff[i,0], coeff[i,1],color = 'r',alpha = 0.5)
        if labels is None:
            plt.text(coeff[i,0]* 1.15, coeff[i,1] * 1.15, "Var"+str(i+1), color = 'g', ha = 'center', va = 'center')
        else:
            plt.text(coeff[i,0]* 1.15, coeff[i,1] * 1.15, labels[i], color = 'g', ha = 'center', va = 'center')
    plt.xlim(-1,1)
    plt.ylim(-1,1)
    plt.xlabel("PC{},Variance:{}".format(1,variance[0]))
    plt.ylabel("PC{},Variance:{}".format(2,variance[1]))
    plt.grid()

    
       

def get_feature_names_from_columntransformer(ct):
    """
        returns feature names in a dataframe passet to a column transformer. Useful if you have lost the names due to conversion to numpy.
        if it doesn't work, try out the one at https://johaupt.github.io/blog/columnTransformer_feature_names.html or at https://lifesaver.codes/answer/cannot-get-feature-names-after-columntransformer-12525
    """
    final_features=[]
    try:
        
        for trs in ct.transformers_:
            trName=trs[0]
            trClass=trs[1]
            features=trs[2]
            if isinstance(trClass,Pipeline):   
                n,tr=zip(*trClass.steps)
                for t in tr: #t is a transformator object, tr is the list of all transoformators in the pipeline                
                    if isinstance(t,OneHotEncoder):
                        for f in t.get_feature_names_out(features):
                            final_features.append("OHE_"+f) 
                        break
                else: #if not found onehotencoder, add the features directly
                    for f in features:
                        final_features.append(f)                
            elif isinstance(trClass,OneHotEncoder): #?type(trClass)==OneHotEncoder:
                for f in trClass.get_feature_names_out(features):
                    final_features.append("OHE_"+f) 
            else:
                #remainders
                if trName=="remainder":
                    for i in features:
                        final_features.append(ct.feature_names_in_[i])
                #all the others
                else:
                    for f in features:
                        final_features.append(f)                
    except AttributeError:
        print("Your sklearn version may be old and you may need to upgrade it via 'python -m pip install scikit-learn -U'")

    return final_features 

def featureImportanceEncoded(feature_importance_array,feature_names,figsize=(8,6)):
    """
        plots the feature importance plot.
        feature_importance_array:feature_importance_ attribute
    """
    plt.figure(figsize=figsize)
    dfimp=pd.DataFrame(feature_importance_array.reshape(-1,1).T,columns=feature_names).T
    dfimp.index.name="Encoded"
    dfimp.rename(columns={0: "Importance"},inplace=True)
    dfimp.reset_index(inplace=True)
    dfimp["Feature"]=dfimp["Encoded"].apply(lambda x:x[4:].split('_')[0] if "OHE" in x else x)
    dfimp.groupby(by='Feature')["Importance"].sum().sort_values().plot(kind='barh');
    
    
def compareEstimatorsInGridSearch(gs,tableorplot='plot',figsize=(10,5),est="param_clf"):
    """
        Gives a comparison table/plot of the estimators in a gridsearch.
    """
    cvres = gs.cv_results_
    cv_results = pd.DataFrame(cvres)
    cv_results[est]=cv_results[est].apply(lambda x:str(x).split('(')[0])
    cols={"mean_test_score":"MAX of mean_test_score","mean_fit_time":"MIN of mean_fit_time"}
    summary=cv_results.groupby(by=est).agg({"mean_test_score":"max", "mean_fit_time":"min"}).rename(columns=cols)
    summary.sort_values(by='MAX of mean_test_score', ascending=False,inplace=True)
    
    
    if tableorplot=='table':
        return summary
    else:
        _, ax1 = plt.subplots(figsize=figsize)
        color = 'tab:red'
        ax1.xaxis.set_ticks(range(len(summary)))
        ax1.set_xticklabels(summary.index, rotation=45,ha='right')
                
        ax1.set_ylabel('MAX of mean_test_score', color=color)
        ax1.bar(summary.index, summary['MAX of mean_test_score'], color=color)
        ax1.tick_params(axis='y', labelcolor=color)
        ax1.set_ylim(0,summary["MAX of mean_test_score"].max()*1.1)

        ax2 = ax1.twinx() 
        color = 'tab:blue'
        ax2.set_ylabel('MIN of mean_fit_time', color=color) 
        ax2.plot(summary.index, summary['MIN of mean_fit_time'], color=color)
        ax2.tick_params(axis='y', labelcolor=color)        
        ax2.set_ylim(0,summary["MIN of mean_fit_time"].max()*1.1)

        plt.show()      

def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    Depreceated. use 'sklearn.metrics.ConfusionMatrixDisplay(cm).plot();'
    """
    warnings.warn("use 'sklearn.metrics.ConfusionMatrixDisplay(cm).plot();'")      
    
def CheckForClusteringTendencyWithHopkins(X,random_state=42):    
    """
    taken from https://matevzkunaver.wordpress.com/2017/06/20/hopkins-test-for-cluster-tendency/
    X:numpy array or dataframe
    the closer to 1, the higher probability of clustering tendency    
    X must be scaled priorly.
    """
        
    d = X.shape[1]
    #d = len(vars) # columns
    n = len(X) # rows
    m = int(0.1 * n) # heuristic from article [1]
    if type(X)==np.ndarray:
        nbrs = NearestNeighbors(n_neighbors=1).fit(X)
    else:
        nbrs = NearestNeighbors(n_neighbors=1).fit(X.values)
    seed(random_state) 
    rand_X = sample(range(0, n, 1), m)
 
    ujd = []
    wjd = []
    for j in range(0, m):
        #-------------------bi ara random state yap----------
        u_dist, _ = nbrs.kneighbors(uniform(np.amin(X,axis=0),np.amax(X,axis=0),d).reshape(1, -1), 2, return_distance=True)
        ujd.append(u_dist[0][1])
        if type(X)==np.ndarray:
            w_dist, _ = nbrs.kneighbors(X[rand_X[j]].reshape(1, -1), 2, return_distance=True)
        else:
            w_dist, _ = nbrs.kneighbors(X.iloc[rand_X[j]].values.reshape(1, -1), 2, return_distance=True)
        wjd.append(w_dist[0][1])
 
    H = sum(ujd) / (sum(ujd) + sum(wjd))
    if isnan(H):
        print(ujd, wjd)
        H = 0
 
    return H    

def getNumberofCatsAndNumsFromDatasets(path,size=10_000_000):
    """
    returns the number of features by their main type(i.e categorical or numeric or datetime)
    args:
        path:path of the files residing in.
        size:size of the file(default is ~10MB). if chosen larger, it will take longer to return.
    """
    os.chdir(path)
    files=os.listdir()
    liste=[]
    for d in files:  
        try:
            if os.path.isfile(d) and os.path.getsize(d)<size:        
                if os.path.splitext(d)[1]==".csv":
                    df=pd.read_csv(d,encoding = "ISO-8859-1")
                elif os.path.splitext(d)[1]==".xlsx":
                    df=pd.read_excel(d)
                else:            
                    continue      

                nums=len(df.select_dtypes("number").columns)        
                date=len(df.select_dtypes(include=[np.datetime64]).columns)
                cats=len(df.select_dtypes("O").columns)-date
                liste.append((d,nums,cats,date))
        except:
            pass

    dffinal=pd.DataFrame(liste,columns=["filename","numeric","categorical","datettime"])
    dffinal.set_index("filename")
    return dffinal


def plot_learning_curve(
    estimator,
    title,
    X,
    y,
    axes=None,
    ylim=None,
    cv=None,
    n_jobs=None,
    train_sizes=np.linspace(0.1, 1.0, 5),
    random_state=42
):
    """
    https://scikit-learn.org/stable/auto_examples/model_selection/plot_learning_curve.html#sphx-glr-auto-examples-model-selection-plot-learning-curve-py
    Generate 3 plots: the test and training learning curve, the training
    samples vs fit times curve, the fit times vs score curve.
    The plot in the second column shows the times required by the models to train 
    with various sizes of training dataset. The plot in the third columns shows 
    how much time was required to train the models for each training sizes.

    Parameters
    ----------
    estimator : estimator instance
        An estimator instance implementing `fit` and `predict` methods which
        will be cloned for each validation.

    title : str
        Title for the chart.

    X : array-like of shape (n_samples, n_features)
        Training vector, where ``n_samples`` is the number of samples and
        ``n_features`` is the number of features.

    y : array-like of shape (n_samples) or (n_samples, n_features)
        Target relative to ``X`` for classification or regression;
        None for unsupervised learning.

    axes : array-like of shape (3,), default=None
        Axes to use for plotting the curves.

    ylim : tuple of shape (2,), default=None
        Defines minimum and maximum y-values plotted, e.g. (ymin, ymax).

    cv : int, cross-validation generator or an iterable, default=None
        Determines the cross-validation splitting strategy.
        Possible inputs for cv are:

          - None, to use the default 5-fold cross-validation,
          - integer, to specify the number of folds.
          - :term:`CV splitter`,
          - An iterable yielding (train, test) splits as arrays of indices.

        For integer/None inputs, if ``y`` is binary or multiclass,
        :class:`StratifiedKFold` used. If the estimator is not a classifier
        or if ``y`` is neither binary nor multiclass, :class:`KFold` is used.

        Refer :ref:`User Guide <cross_validation>` for the various
        cross-validators that can be used here.

    n_jobs : int or None, default=None
        Number of jobs to run in parallel.
        ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
        ``-1`` means using all processors. See :term:`Glossary <n_jobs>`
        for more details.

    train_sizes : array-like of shape (n_ticks,)
        Relative or absolute numbers of training examples that will be used to
        generate the learning curve. If the ``dtype`` is float, it is regarded
        as a fraction of the maximum size of the training set (that is
        determined by the selected validation method), i.e. it has to be within
        (0, 1]. Otherwise it is interpreted as absolute sizes of the training
        sets. Note that for classification the number of samples usually have
        to be big enough to contain at least one sample from each class.
        (default: np.linspace(0.1, 1.0, 5))
    """
    if axes is None:
        _, axes = plt.subplots(1, 3, figsize=(20, 5))

    axes[0].set_title(title)
    if ylim is not None:
        axes[0].set_ylim(*ylim)
    axes[0].set_xlabel("Training examples")
    axes[0].set_ylabel("Score")

    train_sizes, train_scores, test_scores, fit_times, _ = learning_curve(
        estimator,
        X,
        y,
        cv=cv,
        n_jobs=n_jobs,
        train_sizes=train_sizes,
        return_times=True,
        random_state=random_state
    )
    train_scores_mean = np.mean(train_scores, axis=1)
    train_scores_std = np.std(train_scores, axis=1)
    test_scores_mean = np.mean(test_scores, axis=1)
    test_scores_std = np.std(test_scores, axis=1)
    fit_times_mean = np.mean(fit_times, axis=1)
    fit_times_std = np.std(fit_times, axis=1)

    # Plot learning curve
    axes[0].grid()
    axes[0].fill_between(
        train_sizes,
        train_scores_mean - train_scores_std,
        train_scores_mean + train_scores_std,
        alpha=0.1,
        color="r",
    )
    axes[0].fill_between(
        train_sizes,
        test_scores_mean - test_scores_std,
        test_scores_mean + test_scores_std,
        alpha=0.1,
        color="g",
    )
    axes[0].plot(
        train_sizes, train_scores_mean, "o-", color="r", label="Training score"
    )
    axes[0].plot(
        train_sizes, test_scores_mean, "o-", color="g", label="Cross-validation score"
    )
    axes[0].legend(loc="best")

    # Plot n_samples vs fit_times
    axes[1].grid()
    axes[1].plot(train_sizes, fit_times_mean, "o-")
    axes[1].fill_between(
        train_sizes,
        fit_times_mean - fit_times_std,
        fit_times_mean + fit_times_std,
        alpha=0.1,
    )
    axes[1].set_xlabel("Training examples")
    axes[1].set_ylabel("fit_times")
    axes[1].set_title("Scalability of the model")

    # Plot fit_time vs score
    fit_time_argsort = fit_times_mean.argsort()
    fit_time_sorted = fit_times_mean[fit_time_argsort]
    test_scores_mean_sorted = test_scores_mean[fit_time_argsort]
    test_scores_std_sorted = test_scores_std[fit_time_argsort]
    axes[2].grid()
    axes[2].plot(fit_time_sorted, test_scores_mean_sorted, "o-")
    axes[2].fill_between(
        fit_time_sorted,
        test_scores_mean_sorted - test_scores_std_sorted,
        test_scores_mean_sorted + test_scores_std_sorted,
        alpha=0.1,
    )
    axes[2].set_xlabel("fit_times")
    axes[2].set_ylabel("Score")
    axes[2].set_title("Performance of the model")

    plt.show()


def drawNeuralNetwork(layers,figsize=(10,8)):
    """
        Draws a representation of the neural network using networkx.
        layers: list with the number of nodes in each layer, including input and output.
    """
    plt.figure(figsize=figsize)
    # Key each node by its layer index (not by the layer size), so that two layers
    # with the same number of nodes do not overwrite each other.
    pos={}
    for e,l in enumerate(layers):
        for i in range(l):
            pos[str(e)+"_"+str(i)]=((e+1)*50,i*5+50)

    X=nx.Graph()
    X.add_nodes_from(pos.keys())

    edgelist=[] #list of tuples: every node is connected to every node of the next layer
    for e,l in enumerate(layers[:-1]):
        for i in range(l):
            for k in range(layers[e+1]):
                edgelist.append((str(e)+"_"+str(i),str(e+1)+"_"+str(k)))

    X.add_edges_from(edgelist)
    for n, p in pos.items():
        X.nodes[n]['pos'] = p

    nx.draw(X, pos, node_color='r');

def draw_network_graph(ws):
    """
    Draws a network graph of a neural network with dynamic weights.

    Args:
    ws: A list of weight matrices.
    """
    # Create a directed graph
    G = nx.DiGraph()

    # Add nodes
    layer_count = len(ws) + 1  # Include input layer
    node_count = [ws[0].shape[0]] + [w.shape[1] for w in ws]

    for layer in range(layer_count):
        if layer == 0:
            for i in range(node_count[layer]):
                G.add_node(f"Input {i+1}", layer=layer)
        elif layer == layer_count - 1:
            for i in range(node_count[layer]):
                G.add_node(f"Output {i+1}", layer=layer)
        else:
            for i in range(node_count[layer]):
                G.add_node(f"Hidden {layer}_{i+1}", layer=layer)

    # Add edges with weights
    for layer in range(layer_count - 1):
        for i in range(node_count[layer]):
            for j in range(node_count[layer + 1]):
                # pick the node names from the layer position, so a network with a
                # single weight matrix (input -> output) is also handled
                src = f"Input {i+1}" if layer == 0 else f"Hidden {layer}_{i+1}"
                dst = f"Output {j+1}" if layer == layer_count - 2 else f"Hidden {layer + 1}_{j+1}"
                G.add_edge(src, dst, weight=ws[layer][i, j])

    # Draw the graph
    pos = {}
    pos.update({node: (0, i) for i, node in enumerate(
        [f"Input {i+1}" for i in range(node_count[0])]
    )})
    pos.update({node: (layer, i) for layer in range(1, layer_count -1) for i, node in enumerate(
        [f"Hidden {layer}_{i+1}" for i in range(node_count[layer])]
    )})
    pos.update({node: (layer_count - 1, i) for i, node in enumerate(
        [f"Output {i+1}" for i in range(node_count[-1])]
    )})

    nx.draw(G, pos, with_labels=True, node_size=1000, node_color='lightblue', font_size=10)

    # Add edge labels with weights
    edge_labels = {(u, v): f"{d['weight']:.2f}" for u, v, d in G.edges(data=True)}
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)

    plt.show()

def plotROC(y_test,X_test,estimator,pos_label=1,figsize=(6,6)):
    cm = confusion_matrix(y_test, estimator.predict(X_test))    
    fpr, tpr, _ = roc_curve(y_test, estimator.predict_proba(X_test)[:,1],pos_label=pos_label)
    roc_auc = auc(fpr, tpr) #or roc_auc_score(y_test, y_scores)
    plt.figure(figsize=figsize)
    plt.plot(fpr, tpr, label='(ROC-AUC = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], 'k--')
    tn, fp, fn, tp = cm.ravel() #binary classification only
    plt.plot(fp/(fp+tn), tp/(tp+fn), 'ro', markersize=8, label='Decision point (threshold used by predict)')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate (1-specificity)')
    plt.ylabel('True Positive Rate(Recall/Sensitivity)')
    plt.title('ROC Curve (TPR vs FPR at each probability threshold)')
    plt.legend(loc="lower right")
    plt.show();

def plot_precision_recall_curve(y_test_encoded,X_test,estimator,threshs=np.linspace(0.0, 0.98, 40),figsize=(16,6)):
    """
        y_test should be labelencoded.
    """
    pred_prob = estimator.predict_proba(X_test)    
    precision, recall, thresholds = precision_recall_curve(y_test_encoded, pred_prob[:,1])
    pr_auc = auc(recall, precision)    

    Xt = [] ; Yp = [] ; Yr = [] 
    for thresh in threshs:
        with warnings.catch_warnings():
            warnings.filterwarnings("error")
            try:
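                y_pred = binarize(pred_prob, threshold=thresh)[:,1]
                # compute both metrics before appending, so the three lists keep the
                # same length when a warning (raised as an error here) interrupts this step
                pre = precision_score(y_test_encoded, y_pred) #,zero_division=1
                rec = recall_score(y_test_encoded, y_pred)
                Xt.append(thresh) ; Yp.append(pre) ; Yr.append(rec)
            except Warning as e:
                print(f"{thresh:.2f}: skipped, warning raised (probably division by zero): {e}")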

        
    plt.figure(figsize=figsize)
    plt.subplot(121)
    plt.plot(Xt, Yp, "--", label='Precision', color='red')
    plt.plot(Xt, Yr, "--", label='Recall', color='blue')
    plt.title("Precision vs Recall based on decision threshold")
    plt.xlabel('Decision threshold') ; plt.ylabel('Precision - Recall')
    plt.legend()
    plt.subplot(122)
    plt.step(Yr, Yp, color='black', label='LR (PR-AUC = %0.2f)' % pr_auc)
    # calculate the no skill line as the proportion of the positive class
    no_skill = len(y_test_encoded[y_test_encoded==1]) / len(y_test_encoded)
    # plot the no skill precision-recall curve
    plt.plot([0, 1], [no_skill, no_skill], linestyle='--', color='green', label='No Skill')
    # plot the perfect PR curve
    plt.plot([0, 1],[1, 1], color='blue', label='Perfect')
    plt.plot([1, 1],[1, len(y_test_encoded[y_test_encoded==1]) / len(y_test_encoded)], color='blue')
    plt.title('PR Curve')
    plt.xlabel('Recall: TP / (TP+FN)') ; plt.ylabel('Precision: TP / (TP+FP)')
    plt.legend(loc="upper right")
    plt.show();


def find_best_cutoff_for_classification(estimator, y_test_le, X_test, costlist,threshs=np.linspace(0., 0.98, 20)):    
    """
    y_test should be labelencoded as y_test_le
    costlist=cost list for TN, TP, FN, FP
    """
    y_pred_prob = estimator.predict_proba(X_test)
    y_pred = estimator.predict(X_test)
    Xp = [] ; Yp = [] # initialization

    print("Cutoff\t Cost/Instance\t Accuracy\t FN\t FP\t TP\t TN\t Recall\t Precision F1-score")
    for cutoff in threshs:
        with warnings.catch_warnings():
            warnings.filterwarnings("error")
            try:
                y_pred = binarize(y_pred_prob, threshold=cutoff)[:,1]
                cm = confusion_matrix(y_test_le, y_pred)
                TP = cm[1,1]
                TN = cm[0,0]
                FP = cm[0,1]
                FN = cm[1,0]
                cost = costlist[0]*TN + costlist[1]*TP + costlist[2]*FN + costlist[3]*FP
                cost_per_instance = cost/len(y_test_le)
                Xp.append(cutoff)
                Yp.append(cost_per_instance)
                acc=accuracy_score(y_test_le, y_pred)
                rec = cm[1,1]/(cm[1,1]+cm[1,0])
                pre = cm[1,1]/(cm[1,1]+cm[0,1])
                f1  = 2*pre*rec/(pre+rec)
                print(f"{cutoff:.2f}\t {cost_per_instance:.2f}\t\t {acc:.3f}\t\t {FN}\t {FP}\t {TP}\t {TN}\t {rec:.3f}\t {pre:.3f}\t   {f1:.3f}")
            except Warning as e:
                #typically a division by zero while computing recall, precision or F1-score for this cutoff
                print(f"{cutoff:.2f}\t {cost_per_instance:.2f}\t\t {acc:.3f}\t\t {FN}\t {FP}\t {TP}\t {TN}\t metric undefined at this cutoff ({e})")

    plt.figure(figsize=(10,6))
    plt.plot(Xp, Yp)
    plt.xlabel('Threshold value for probability')
    plt.ylabel('Cost per instance')
    plt.axhline(y=min(Yp), xmin=0., xmax=1., linewidth=1, color = 'r')
    plt.show();


def plot_gain_and_lift(estimator,X_test,y_test,pos_label="Yes",figsize=(16,6)):
    """    
        y_test as numpy array
        prints the gain and lift values and plots the charts.    
    """
    prob_df=pd.DataFrame({"Prob":estimator.predict_proba(X_test)[:,1]})
    prob_df["label"]=np.where(y_test==pos_label,1,0)
    prob_df = prob_df.sort_values(by="Prob",ascending=False)
    prob_df['Decile'] = pd.qcut(prob_df['Prob'], 10, labels=list(range(1,11))[::-1])

    #Calculate the number of actual positives (e.g. churn) in each decile
    res = pd.crosstab(prob_df['Decile'], prob_df['label'])[1].reset_index().rename(columns = {1: 'Number of Responses'})
    #rename_axis + reset_index(name=...) yields the same column names in older and newer pandas versions
    lg = prob_df['Decile'].value_counts(sort = False).rename_axis('Decile').reset_index(name = 'Number of Cases')
    lg = pd.merge(lg, res, on = 'Decile').sort_values(by = 'Decile', ascending = False).reset_index(drop = True)
    #Calculate the cumulative
    lg['Cumulative Responses'] = lg['Number of Responses'].cumsum()
    #Calculate the percentage of positives in each decile compared to the total number of positives
    lg['% of Events'] = np.round(((lg['Number of Responses']/lg['Number of Responses'].sum())*100),2)
    #Calculate the Gain in each decile
    lg['Gain'] = lg['% of Events'].cumsum()
    lg['Decile'] = lg['Decile'].astype('int')
    lg['lift'] = np.round((lg['Gain']/(lg['Decile']*10)),2)
    display(lg)

    plt.figure(figsize=figsize)
    plt.subplot(121)
    plt.plot(lg["Decile"],lg["lift"],label="Model")
    plt.plot(lg["Decile"],[1 for i in range(10)],label="Random")
    plt.title("Lift Chart")
    plt.legend()
    plt.xlabel("Decile")
    plt.ylabel("Lift")    
    
    plt.subplot(122)
    plt.plot(lg["Decile"],lg["Gain"],label="Model")
    plt.plot(lg["Decile"],[10*(i+1) for i in range(10)],label="Random")
    plt.title("Gain Chart")
    plt.legend()
    plt.xlabel("Decile")
    plt.ylabel("Gain")
    plt.xlim(0,11)
    plt.ylim(0,110)
    plt.show();    

def plot_gain_and_lift_orj(estimator,X_test,y_test,pos_label="Yes"):
    """    
        y_test as numpy array
        prints the gain and lift values and plots the charts.    
    """
    prob_df=pd.DataFrame({"Prob":estimator.predict_proba(X_test)[:,1]})
    prob_df["label"]=np.where(y_test==pos_label,1,0)
    prob_df = prob_df.sort_values(by="Prob",ascending=False)
    prob_df['Decile'] = pd.qcut(prob_df['Prob'], 10, labels=list(range(1,11))[::-1])

    #Calculate the number of actual positives (e.g. churn) in each decile
    res = pd.crosstab(prob_df['Decile'], prob_df['label'])[1].reset_index().rename(columns = {1: 'Number of Responses'})
    #rename_axis + reset_index(name=...) yields the same column names in older and newer pandas versions
    lg = prob_df['Decile'].value_counts(sort = False).rename_axis('Decile').reset_index(name = 'Number of Cases')
    lg = pd.merge(lg, res, on = 'Decile').sort_values(by = 'Decile', ascending = False).reset_index(drop = True)
    #Calculate the cumulative
    lg['Cumulative Responses'] = lg['Number of Responses'].cumsum()
    #Calculate the percentage of positives in each decile compared to the total number of positives
    lg['% of Events'] = np.round(((lg['Number of Responses']/lg['Number of Responses'].sum())*100),2)
    #Calculate the Gain in each decile
    lg['Gain'] = lg['% of Events'].cumsum()
    lg['Decile'] = lg['Decile'].astype('int')
    lg['lift'] = np.round((lg['Gain']/(lg['Decile']*10)),2)
    display(lg)
    
    plt.plot(lg["Decile"],lg["lift"],label="Model")
    plt.plot(lg["Decile"],[1 for i in range(10)],label="Random")
    plt.title("Lift Chart")
    plt.legend()
    plt.xlabel("Decile")
    plt.ylabel("Lift")
    plt.show();
    
    plt.plot(lg["Decile"],lg["Gain"],label="Model")
    plt.plot(lg["Decile"],[10*(i+1) for i in range(10)],label="Random")
    plt.title("Gain Chart")
    plt.legend()
    plt.xlabel("Decile")
    plt.ylabel("Gain")
    plt.xlim(0,11)
    plt.ylim(0,110)
    plt.show();    

def linear_model_feature_importance(estimator,preprocessor,feature_selector=None,clfreg_name="clf"):
    """
    plots the feature importance, namely coefficients for linear models.
    args:
        estimator:either pipeline or gridsearch/randomizedsearch object
        preprocessor:variable name of the preprocessor, which is a ColumnTransformer
        feature_selector:if there is a feature selector step, its name.
        clfreg_name:name of the linear model, usually clf for a classifier, reg for a regressor        
    """
    # resolve the fitted pipeline first: it is needed whether or not there is a feature selector
    if isinstance(estimator,(GridSearchCV,RandomizedSearchCV,HalvingGridSearchCV,HalvingRandomSearchCV)):
        est=estimator.best_estimator_
    elif isinstance(estimator,Pipeline):
        est=estimator
    else:
        print("Either a pipeline or a gridsearch/randomsearch object should be passed for estimator")
        return

    final_features=get_feature_names_from_columntransformer(preprocessor)
    if feature_selector is not None:
        # keep only the features that the selector step retained
        selecteds=est[feature_selector].get_support()
        final_features=[x for x,keep in zip(final_features,selecteds) if keep]

    # coef_ is 2D for classifiers and usually 1D for regressors; take the first row in both cases
    importance=np.atleast_2d(est[clfreg_name].coef_)[0]
    plt.bar(final_features, importance)
    plt.xticks(rotation= 45,horizontalalignment="right");    



def gridsearch_to_df(searcher,topN=5):
    """
    searcher: any of grid/randomized searcher objects or their halving versions
    """
    cvresultdf = pd.DataFrame(searcher.cv_results_)
    cvresultdf = cvresultdf.sort_values("mean_test_score", ascending=False)
    cols=[x for x in searcher.cv_results_.keys() if "param_" in x]+["mean_test_score","std_test_score"]
    return cvresultdf[cols].head(topN)   


def getAnotherEstimatorFromGridSearch(gs_object,estimator):
    cvres = gs_object.cv_results_
    cv_results = pd.DataFrame(cvres)
    cv_results["param_clf"]=cv_results["param_clf"].apply(lambda x:str(x).split('(')[0])

    dtc=cv_results[cv_results["param_clf"]==estimator]
    return dtc.getRowOnAggregation_("mean_test_score","max")["params"].values 

def cooksdistance(X,y,figsize=(8,6),ylim=0.5):    
    model = sm.OLS(y,X)
    fitted = model.fit()
    # Cook's distance
    pr=X.shape[1]
    CD = 4.0/(X.shape[0]-pr-1)
    influence = fitted.get_influence()
    #c is the distance and p is p-value
    (c, p) = influence.cooks_distance
    plt.figure(figsize=figsize)
    plt.stem(np.arange(len(c)), c, markerfmt=",")
    plt.axhline(y=CD, color='r')
    plt.ylabel('Cook\'s D')
    plt.xlabel('Observation Number')
    plt.ylim(0,ylim)
    plt.show();

Writing¶

In [303]:

yenidosya=io.open("writetest.txt","w") #x
yenidosya.write("merhaba\n")
lines=["satır1\n","satır2\n"]
yenidosya.writelines(lines)
yenidosya.close()

Out[303]:

In [304]:

#write to an existing file, appending to the end
yenidosya=io.open("writetest.txt","a")
yenidosya.write("\nselam")
yenidosya.flush() #write immediately; without this, the changes may not be visible right away
yenidosya.close()

Out[304]:

In [305]:

!rm writetest.txt
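
The same write and append steps can also be sketched with a `with` block, which closes (and therefore flushes) the file automatically, even if an error occurs in between. The file name `with_demo.txt` and the temporary directory are assumptions made for this illustration only.

```python
import os
import tempfile

# hypothetical file in a temporary directory, used only for this sketch
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# "w" creates the file (or truncates an existing one)
with open(path, "w", encoding="utf-8") as f:
    f.write("merhaba\n")
    f.writelines(["satır1\n", "satır2\n"])

# "a" appends to the end of the existing file
with open(path, "a", encoding="utf-8") as f:
    f.write("selam\n")

# read the file back to check what was written
with open(path, encoding="utf-8") as f:
    content = f.read()

print(content)
os.remove(path)  # clean up, like the `!rm` cell above
```

With this pattern there is no need to call `flush()` or `close()` by hand.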

Database operations¶

In [306]:

import sqlite3 as sql #ships with Python's standard library

Download the Chinook sample database from the SQLite page.

In [307]:

vt = sql.connect('/content/drive/MyDrive/Programming/PythonRocks/dataset/chinook.db')

In [308]:

cur=vt.cursor()

In [309]:

cur.execute("select * from albums")

Out[309]:

<sqlite3.Cursor at 0x7d45a4df2bc0>

In [310]:

cur.fetchmany(3)

Out[310]:

[(1, 'For Those About To Rock We Salute You', 1),
 (2, 'Balls to the Wall', 2),
 (3, 'Restless and Wild', 2)]

In [311]:

veriler = cur.fetchall()

In [312]:

veriler[:5] #the first 3 rows were already fetched, so this continues from the 4th

Out[312]:

[(4, 'Let There Be Rock', 1),
 (5, 'Big Ones', 3),
 (6, 'Jagged Little Pill', 4),
 (7, 'Facelift', 5),
 (8, 'Warner 25 Anos', 6)]

In [313]:

vt.close()

NOTE: sqlite3 is a very simple database. To query more powerful databases such as Oracle or SQL Server, we use modules such as sqlalchemy or pyodbc, and the data retrieved this way can then be processed with pandas. For this, I recommend looking at the Python Veri Analizi notebook in my GitHub repo.

The site http://sqlitebrowser.org/ is also worth a look.
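
If the Chinook file is not at hand, the cursor behavior shown above can be tried on an in-memory database. This is only a sketch: the `albums` table layout and the rows are made up for the illustration and are not the real Chinook data.

```python
import sqlite3

# in-memory database, so no file is needed; table and rows are invented for this sketch
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER)")
rows = [(1, "Album A", 1), (2, "Album B", 2), (3, "Album C", 2),
        (4, "Album D", 1), (5, "Album E", 3)]
cur.executemany("INSERT INTO albums VALUES (?, ?, ?)", rows)
conn.commit()

cur.execute("SELECT * FROM albums ORDER BY AlbumId")
first_three = cur.fetchmany(3)   # consumes rows 1-3
remaining = cur.fetchall()       # continues from row 4

# parameterized query: the ? placeholder keeps the value out of the SQL string
cur.execute("SELECT Title FROM albums WHERE ArtistId = ? ORDER BY AlbumId", (2,))
titles = [title for (title,) in cur.fetchall()]

conn.close()
print(first_three)
print(remaining)
print(titles)
```

The cursor keeps its position between fetch calls, which is why `fetchall()` returns only the rows that `fetchmany(3)` has not consumed yet.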

Classes¶

Python is an object-oriented (OO) language and, as in all OO languages, classes can be created. An example class definition is shown below; for the details, please search online.

In [314]:

class Araba:
    aractipi="Mekanik" #class-level value, shared by all Araba instances
    def __init__(self,model,marka,km):
        self.model=model
        self.marka=marka
        self.km=km
        print("yeni araç hazır")
    def run(self):
        print("çalışıyor")
    def stop(self):
        print("durdu")

bmw0=Araba(2011,"bmw",0)
bmw1=Araba(2014,"bmw",0)
audi=Araba(2011,"audi",0)
print(bmw0)
bmw0.run()
bmw0.stop()
print(bmw0.aractipi)
print(audi.aractipi)

yeni araç hazır
yeni araç hazır
yeni araç hazır
<__main__.Araba object at 0x7d45a4bfff70>
çalışıyor
durdu
Mekanik
Mekanik
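
Two details of the output above can be explored a little further: the unreadable `<__main__.Araba object at 0x...>` line, and how a class-level value behaves when it is assigned through one instance. The sketch below uses a hypothetical `Car` class (an English counterpart of `Araba`, not part of the original notebook) with a `__repr__` method added.

```python
class Car:
    # hypothetical English counterpart of the Araba class above
    vehicle_type = "Mekanik"  # class-level attribute shared by all instances

    def __init__(self, model, brand, km):
        self.model = model
        self.brand = brand
        self.km = km

    def __repr__(self):
        # used by print() and by the notebook instead of <__main__.Car object at 0x...>
        return f"Car(model={self.model!r}, brand={self.brand!r}, km={self.km!r})"


a = Car(2011, "bmw", 0)
b = Car(2011, "audi", 0)

a.vehicle_type = "Elektrik"   # creates an instance attribute on `a` only
print(a)                      # Car(model=2011, brand='bmw', km=0)
print(a.vehicle_type, b.vehicle_type, Car.vehicle_type)
```

Assigning through the instance shadows the class attribute for that object only; `b` and the class itself still see the shared value.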

Parallelization¶

This is especially useful when building analytical models. I will have a separate notebook for it. Those who want to look into it in advance can research the following concepts: multi-threading, multiprocessing, concurrency, and parallelism.
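
As a small taste of multi-threading, here is a minimal sketch using the standard library's `concurrent.futures`. The `slow_square` function is a hypothetical stand-in for an I/O-bound task (for example a network call); it is not taken from the separate notebook mentioned above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    # hypothetical stand-in for an I/O-bound task: the sleep simulates waiting
    time.sleep(0.1)
    return x * x


numbers = [1, 2, 3, 4]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map returns the results in the same order as the inputs
    results = list(pool.map(slow_square, numbers))
elapsed = time.perf_counter() - start

print(results)
print(f"{elapsed:.2f}s")
```

Threads help mainly while tasks are waiting (I/O). For CPU-heavy work in standard CPython, processes (for example `ProcessPoolExecutor`) are usually the better fit.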

Efficiency and Other Topics¶

debugging¶

In [315]:

import pdb
print(4)
pdb.set_trace() #commands such as c (continue) and n (next) are available
print("asda")

PYDEV DEBUGGER WARNING:
sys.settrace() should not be used when the debugger is being used.
This may cause the debugger to stop working correctly.
If this is needed, please check: 
http://pydev.blogspot.com/2007/06/why-cant-pydev-debugger-work-with.html
to see how to restore the debug tracing back correctly.
Call Location:
  File "/usr/lib/python3.10/bdb.py", line 336, in set_trace
    sys.settrace(self.trace_dispatch)

4
--Call--
> /usr/local/lib/python3.10/dist-packages/IPython/core/displayhook.py(252)__call__()
    250         sys.stdout.flush()
    251 
--> 252     def __call__(self, result=None):
    253         """Printing with history cache management.
    254 

ipdb> c

PYDEV DEBUGGER WARNING:
sys.settrace() should not be used when the debugger is being used.
This may cause the debugger to stop working correctly.
If this is needed, please check: 
http://pydev.blogspot.com/2007/06/why-cant-pydev-debugger-work-with.html
to see how to restore the debug tracing back correctly.
Call Location:
  File "/usr/lib/python3.10/bdb.py", line 347, in set_continue
    sys.settrace(None)

asda

memory management¶

In [317]:

import sys
import array
t=(1,2,3)
l=[1,2,"3"]
a=array.array("l",[1,2,3])
print(sys.getsizeof(t)) #smallest: a tuple is immutable, so it needs no spare capacity
print(sys.getsizeof(l)) #larger than the tuple: a list is mutable and keeps extra bookkeeping
print(sys.getsizeof(a)) #an array stores the raw values inline; with only 3 elements its fixed overhead makes it the largest here

64
88
104
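
The numbers above are shallow sizes: for a list or a tuple, `sys.getsizeof` counts the container and its pointers, not the objects the pointers refer to. The sketch below, which assumes CPython and uses an arbitrary `n = 1000`, adds up the element sizes to show where an array starts to pay off.

```python
import sys
import array

n = 1000  # arbitrary size chosen for this sketch
as_list = list(range(n))
as_array = array.array("l", range(n))

# shallow size: the list object and its pointers, not the int objects
list_shallow = sys.getsizeof(as_list)
# rough "deep" size: add each element (an approximation, since shared
# objects would be counted once per reference)
list_deep = list_shallow + sum(sys.getsizeof(x) for x in as_list)
# the array holds the raw values itself, so its shallow size is the whole story
array_size = sys.getsizeof(as_array)

print(list_shallow, list_deep, array_size)
```

Once the elements are included, the list needs noticeably more memory than the array holding the same values.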

Cheatsheet¶

Test yourself¶

You can find the challenge questions at either of the addresses below.

https://mybinder.org/v2/gh/VolkiTheDreamer/PythonRocks/master (interactive, no need to download anything)
if the link above does not open: https://nbviewer.org/github/VolkiTheDreamer/PythonRocks/blob/master/Python%20Challenges.ipynb

Besides these, you can also practice on the following sites.