https://zzsza.github.io/data/2018/06/09/apache-spark-cluster/
Click SSH
Register an SSH key
~$ ssh-keygen
~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
~$ chmod og-wx ~/.ssh/authorized_keys
~$ wget http://mirror.navercorp.com/apache/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
Reference: https://tecadmin.net/install-oracle-java-8-ubuntu-via-ppa/
~$ sudo add-apt-repository ppa:webupd8team/java
~$ sudo apt-get update
~$ sudo apt-get install oracle-java8-installer
~$ tar xvfz spark-2.3.0-bin-hadoop2.7.tgz
~$ cd spark-2.3.0-bin-hadoop2.7/
~/spark-2.3.0-bin-hadoop2.7$ ./bin/spark-shell
~$ cd spark-2.3.0-bin-hadoop2.7/sbin
~/spark-2.3.0-bin-hadoop2.7/sbin$ ./start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /home/sdrlurker2/spark-2.3.0-bin-hadoop2.7/logs/spark-sdrlurker2-org.apache.spark.deploy.master.Master-1-instance-1.out
~/spark-2.3.0-bin-hadoop2.7/sbin$ grep "spark://" /home/sdrlurker2/spark-2.3.0-bin-hadoop2.7/logs/spark-sdrlurker2-org.apache.spark.deploy.master.Master-1-instance-1.out
2018-06-18 14:33:38 INFO Master:54 - Starting Spark master at spark://instance-1.c.arctic-compass-206702.internal:7077
- Master URL in this example: spark://instance-1.c.arctic-compass-206702.internal:7077
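The `grep` step above can also be done programmatically. The following is a minimal sketch, assuming the master log contains a line of the form shown above; `find_master_url` and the regular expression are hypothetical helpers, not part of Spark.

```python
import re

# Matches a standalone master URL such as spark://host:7077
MASTER_URL_RE = re.compile(r"spark://[\w.\-]+:\d+")

def find_master_url(log_text):
    """Return the first spark:// URL found in the log text, or None."""
    match = MASTER_URL_RE.search(log_text)
    return match.group(0) if match else None

# Log line copied from the master log shown above.
line = ("2018-06-18 14:33:38 INFO Master:54 - Starting Spark master at "
        "spark://instance-1.c.arctic-compass-206702.internal:7077")
print(find_master_url(line))
```

The returned string is the value to pass to `--master` when launching `spark-shell`.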
~/spark-2.3.0-bin-hadoop2.7/sbin$ cd ../bin
~/spark-2.3.0-bin-hadoop2.7/bin$ ./spark-shell --master spark://instance-1.c.arctic-compass-206702.internal:7077
scala> sc.makeRDD(List(1,2,3)).count
- The job does not run because no workers are registered with the master.
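Before submitting a job, it can help to confirm that the master has at least one live worker. The sketch below assumes the standalone master serves a JSON version of its status page at `/json/`, with a `workers` list whose entries carry a `state` field. The path and the field names are assumptions to verify against your Spark version. `fetch_master_status` and `alive_workers` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

def fetch_master_status(base_url="http://localhost:8080"):
    """Fetch the master's JSON status page (assumed to be served at /json/)."""
    with urlopen(base_url + "/json/", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))

def alive_workers(status):
    """Return the workers whose state is ALIVE from a parsed status dict."""
    return [w for w in status.get("workers", []) if w.get("state") == "ALIVE"]

# Hypothetical payload for a master with no registered workers.
empty_status = {
    "url": "spark://instance-1.c.arctic-compass-206702.internal:7077",
    "workers": [],
}
if not alive_workers(empty_status):
    print("No ALIVE workers: a job submitted to this master will not be scheduled.")
```

On a live cluster, `alive_workers(fetch_master_status())` would replace the hard-coded payload.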
Click Firewall rules
Create a firewall rule
Click Create
Stop the existing instance-1 before continuing with the steps below.
Compute Engine -> Snapshots
Click Create snapshot
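Once the firewall rule is created, a quick TCP probe can confirm that the ports are reachable. The walkthrough does not say which ports the rule opens, so the sketch below assumes the ports that appear elsewhere in it (7077, 8080, 8081, 4040). `port_open`, `check_ports`, and `SPARK_PORTS` are hypothetical names.

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports used in this walkthrough: master RPC, master UI, worker UI, application UI.
SPARK_PORTS = {"master": 7077, "master-ui": 8080, "worker-ui": 8081, "app-ui": 4040}

def check_ports(host, ports=SPARK_PORTS):
    """Map each named port to whether it is reachable on host."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling `check_ports("<external IP of instance-1>")` from a machine outside the network shows which of these ports the rule actually exposes.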
~/spark-2.3.0-bin-hadoop2.7/sbin$ ./start-slave.sh
Usage: ./sbin/start-slave.sh [options] <master>
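As the usage message shows, `start-slave.sh` needs the master URL as an argument. The sketch below builds that command line and rejects a URL without the `spark://` scheme. `start_slave_command` is a hypothetical helper, and the Spark directory is assumed to be the one unpacked earlier.

```python
import os

def start_slave_command(spark_home, master_url):
    """Build the argv for start-slave.sh, which requires the master URL."""
    if not master_url.startswith("spark://"):
        raise ValueError("master URL must start with spark://")
    return [os.path.join(spark_home, "sbin", "start-slave.sh"), master_url]

cmd = start_slave_command(
    "spark-2.3.0-bin-hadoop2.7",
    "spark://instance-1.c.arctic-compass-206702.internal:7077",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` on the worker instance.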
Start the existing instance-1 again by clicking the Start button.
Start the master
~$ cd spark-2.3.0-bin-hadoop2.7/
~/spark-2.3.0-bin-hadoop2.7$ cd sbin
~/spark-2.3.0-bin-hadoop2.7/sbin$ ./start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /home/sdrlurker2/spark-2.3.0-bin-hadoop2.7/logs/spark-sdrlurker2-org.apache.spark.deploy.master.Master-1-instance-1.out
~/spark-2.3.0-bin-hadoop2.7/bin$ grep "spark://" /home/sdrlurker2/spark-2.3.0-bin-hadoop2.7/logs/spark-sdrlurker2-org.apache.spark.deploy.master.Master-1-instance-1.out
2018-06-19 05:15:04 INFO Master:54 - Starting Spark master at spark://instance-1.c.arctic-compass-206702.internal:7077
~/spark-2.3.0-bin-hadoop2.7/bin$ ./spark-shell --master=spark://instance-1.c.arctic-compass-206702.internal:7077
~$ cd spark-2.3.0-bin-hadoop2.7/conf
~/spark-2.3.0-bin-hadoop2.7/conf$ cp -p slaves.template slaves
~/spark-2.3.0-bin-hadoop2.7/conf$ vi slaves
# Add the internal or external IP of each slave, one per line.
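The `slaves` file lists one worker host per line. The sketch below generates its content from a list of addresses. `render_slaves_file` is a hypothetical helper, and the IPs are the worker addresses that appear in the master UI table in this walkthrough.

```python
def render_slaves_file(hosts):
    """Return conf/slaves content: one host per line, blanks and duplicates dropped."""
    seen = []
    for host in hosts:
        host = host.strip()
        if host and host not in seen:
            seen.append(host)
    return "\n".join(seen) + "\n"

# Internal IPs taken from the worker table shown in the master UI output.
worker_ips = ["10.146.0.2", "10.146.0.3", "10.146.0.5", "10.146.0.4"]
print(render_slaves_file(worker_ips), end="")
```

Writing the returned string to `conf/slaves` has the same effect as editing the file by hand in `vi`.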
~/spark-2.3.0-bin-hadoop2.7/sbin$ ./start-slaves.sh
# Fetch the master web UI (port 8080) and render the page inline in the notebook.
import requests
from IPython.display import HTML
r = requests.get("http://localhost:8080")
HTML(r.text)
Worker Id | Address | State | Cores | Memory |
---|---|---|---|---|
worker-20180619051632-10.146.0.2-37773 | 10.146.0.2:37773 | ALIVE | 1 (1 Used) | 2.6 GB (1024.0 MB Used) |
worker-20180619051632-10.146.0.2-42601 | 10.146.0.2:42601 | ALIVE | 1 (1 Used) | 2.6 GB (1024.0 MB Used) |
worker-20180619051637-10.146.0.3-44837 | 10.146.0.3:44837 | ALIVE | 1 (1 Used) | 2.6 GB (1024.0 MB Used) |
worker-20180619051637-10.146.0.5-38961 | 10.146.0.5:38961 | ALIVE | 1 (1 Used) | 2.6 GB (1024.0 MB Used) |
worker-20180619051639-10.146.0.4-45191 | 10.146.0.4:45191 | ALIVE | 1 (1 Used) | 2.6 GB (1024.0 MB Used) |
Application ID | Name | Cores | Memory per Executor | Submitted Time | User | State | Duration |
---|---|---|---|---|---|---|---|
app-20180619052812-0000 | Spark shell | 5 | 1024.0 MB | 2018/06/19 05:28:12 | sdrlurker2 | RUNNING | 4.4 min |
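The two tables above are consistent with each other: five workers with one core each account for the five cores held by the Spark shell application. A minimal sketch of that arithmetic follows, assuming the `Cores` cell format shown in the worker table. `parse_cores` is a hypothetical helper.

```python
import re

CORES_RE = re.compile(r"(\d+) \((\d+) Used\)")

def parse_cores(cell):
    """Parse a Cores cell such as '1 (1 Used)' into (total, used)."""
    m = CORES_RE.fullmatch(cell.strip())
    if m is None:
        raise ValueError("unexpected Cores cell: %r" % cell)
    return int(m.group(1)), int(m.group(2))

# Cores cells of the five workers listed in the worker table.
cells = ["1 (1 Used)"] * 5
total = sum(parse_cores(c)[0] for c in cells)
used = sum(parse_cores(c)[1] for c in cells)
print(total, used)
```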
Application ID | Name | Cores | Memory per Executor | Submitted Time | User | State | Duration |
---|---|---|---|---|---|---|---|
# Fetch the worker web UI (port 8081) and render it inline.
r = requests.get("http://localhost:8081")
HTML(r.text)
ExecutorID | Cores | State | Memory | Job Details | Logs |
---|---|---|---|---|---|
1 | 1 | RUNNING | 1024.0 MB | | stdout stderr |
~/spark-2.3.0-bin-hadoop2.7/bin$ ./spark-shell --master=spark://instance-1.c.arctic-compass-206702.internal:7077
# Fetch the Jobs page of the running application's web UI (port 4040).
r = requests.get("http://localhost:4040/Jobs")
HTML(r.text)
from IPython.display import IFrame
IFrame('https://www.zepl.com/viewer/notebooks/bm90ZTovL1NEUkx1cmtlci84MjlmMTM4ZDEzZmY0Yjk0YTQ2MDQyNGFjMmZjMTcwYy9ub3RlLmpzb24', width='100%', height=600)