pandas

import pandas as pd

# other readers include read_csv, read_excel, read_json, ...
data = pd.read_html('https://ko.wikipedia.org/wiki/%EB%8C%80%ED%95%9C%EB%AF%BC%EA%B5%AD%EC%9D%98_%EC%9D%B8%EA%B5%AC')

data[4]

인구수 = data[4]
사망자수 = 인구수[['사망자수(명)']]
사망자수

사망자수.sum()

사망자수(명)    28836332
dtype: int64

사망자수.sum().iloc[0]  # select by position; plain [0] on a labeled Series is deprecated

28836332

format(10000000000000, ',')

'10,000,000,000,000'

format(사망자수.sum().iloc[0], ',')

'28,836,332'
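The sum-and-format step above can also be tried without a network connection. The sketch below uses a small made-up table (the `population` frame, the `deaths` column, and its values are placeholders, not the Wikipedia figures) to show why `sum()` returns a Series for a one-column DataFrame and a plain number for a Series.

```python
import pandas as pd

# Hypothetical stand-in for a table returned by pd.read_html; values are made up.
population = pd.DataFrame({"year": [2019, 2020, 2021],
                           "deaths": [1200, 1350, 1425]})

# Double brackets keep a one-column DataFrame, so sum() returns a Series.
total_as_series = population[["deaths"]].sum()

# Single brackets give a Series, so sum() returns a single number.
total = population["deaths"].sum()

# iloc[0] pulls the first value out of the Series by position.
formatted = format(total_as_series.iloc[0], ",")
print(formatted)
```

Selecting the column with single brackets avoids the extra indexing step entirely.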

Official website tutorial

What kind of data does pandas handle?

import pandas as pd

# In practice most data arrives as CSV files, so you will rarely build a DataFrame from a dict.
df = pd.DataFrame(
    {
        "Name": [
            "Braund, Mr. Owen Harris",
            "Allen, Mr. William Henry",
            "Bonnell, Miss. Elizabeth",
        ],
        "Age": [22, 35, 58],
        "Sex": ["male", "male", "female"],
    }
)

df

A Series is a single column of a DataFrame.

df["Age"]

0    22
1    35
2    58
Name: Age, dtype: int64

type(df['Age'])

pandas.core.series.Series

type(df)

pandas.core.frame.DataFrame

df[["Age"]]

type(df[["Age"]])

pandas.core.frame.DataFrame

Do something with a DataFrame or Series

df["Age"].max()

58

df["Age"].min()

22

df["Age"].mean()

38.333333333333336

df["Age"].var()

332.3333333333333

df["Age"].std()

18.230011885167087

df.dtypes

Name    object
Age      int64
Sex     object
dtype: object

df.describe()

How do I read and write tabular data?

titanic = pd.read_csv("train.csv")

titanic

titanic.head()

titanic.tail()

titanic.dtypes

PassengerId      int64
Survived         int64
Pclass           int64
Name            object
Sex             object
Age            float64
SibSp            int64
Parch            int64
Ticket          object
Fare           float64
Cabin           object
Embarked        object
dtype: object

titanic.to_excel("titanic.xlsx", sheet_name="passengers", index=False)

titanic_read_excel = pd.read_excel("titanic.xlsx", sheet_name="passengers")
titanic_read_excel

titanic.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB

How do I select a subset of a DataFrame?

titanic["Age"].shape

(891,)

titanic["Sex"].shape

(891,)

titanic[["Age", "Sex"]] # double brackets are needed: the inner list selects several columns and the result is a DataFrame

type(titanic[["Age", "Sex"]])

pandas.core.frame.DataFrame

titanic[["Age", "Sex"]].shape

(891, 2)

How do I filter specific rows from a DataFrame?

above_35 = titanic[titanic["Age"] > 35]
above_35.head(10)

남자 = titanic[titanic["Sex"] == 'male']
남자.head(10)
남자.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 577 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  577 non-null    int64  
 1   Survived     577 non-null    int64  
 2   Pclass       577 non-null    int64  
 3   Name         577 non-null    object 
 4   Sex          577 non-null    object 
 5   Age          453 non-null    float64
 6   SibSp        577 non-null    int64  
 7   Parch        577 non-null    int64  
 8   Ticket       577 non-null    object 
 9   Fare         577 non-null    float64
 10  Cabin        107 non-null    object 
 11  Embarked     577 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 58.6+ KB

titanic["Age"] > 35
(titanic["Age"] > 35).sum()

217

above_35.shape

(217, 12)

남자.shape

(577, 12)

titanic.shape

(891, 12)

891 - 577 # subtracting counts like this gives a wrong answer when the column has missing values, so always check for empty values first

314
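To make the warning about missing values concrete, here is a minimal sketch with a made-up `passengers` frame (not the Titanic data) in which one `Sex` value is missing. The naive subtraction silently counts the missing row as female.

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing value in "Sex" (illustration only).
passengers = pd.DataFrame({"Sex": ["male", "female", np.nan, "male", "female"]})

n_total = len(passengers)
n_male = (passengers["Sex"] == "male").sum()
n_missing = passengers["Sex"].isna().sum()

# Naive subtraction counts the missing row as female.
naive_female = n_total - n_male
# Counting the category directly is not affected by missing values.
actual_female = (passengers["Sex"] == "female").sum()

print(naive_female, actual_female, n_missing)
# dropna=False lists the missing values as their own category.
print(passengers["Sex"].value_counts(dropna=False))
```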

titanic.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB

class_23 = titanic[(titanic["Pclass"] == 2) | (titanic["Pclass"] == 3)]
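When the same column is compared against several values, `isin()` is a common alternative to chaining conditions with `|`. A small sketch on made-up rows (the `passengers` frame is a placeholder for `titanic`):

```python
import pandas as pd

# Made-up rows standing in for the Titanic data.
passengers = pd.DataFrame({"Pclass": [1, 2, 3, 3, 1, 2],
                           "Fare": [71.3, 13.0, 7.9, 8.1, 53.1, 26.0]})

# Two conditions combined with | ; each condition needs its own parentheses.
with_or = passengers[(passengers["Pclass"] == 2) | (passengers["Pclass"] == 3)]

# isin() expresses the same membership test with a list of allowed values.
with_isin = passengers[passengers["Pclass"].isin([2, 3])]

print(with_or.equals(with_isin))
```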

How do I select specific rows and columns from a DataFrame?

adult_names = titanic.loc[titanic["Age"] > 35, "Name"]
adult_names

1      Cumings, Mrs. John Bradley (Florence Briggs Th...
6                                McCarthy, Mr. Timothy J
11                              Bonnell, Miss. Elizabeth
13                           Andersson, Mr. Anders Johan
15                      Hewlett, Mrs. (Mary D Kingcome) 
                             ...                        
865                             Bystrom, Mrs. (Karolina)
871     Beckwith, Mrs. Richard Leonard (Sallie Monypeny)
873                          Vander Cruyssen, Mr. Victor
879        Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)
885                 Rice, Mrs. William (Margaret Norton)
Name: Name, Length: 217, dtype: object

adult_names = titanic.loc[titanic["Age"] > 35, ["Name", "Sex"]]
adult_names

titanic.iloc[9:25, 2:6]
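The difference between `loc` and `iloc` is easiest to see on a frame whose index does not start at 0. This is a sketch on made-up data (the `passengers` frame and its index values are assumptions): `loc` works with labels and boolean masks and includes both ends of a slice, while `iloc` works with positions and excludes the stop position.

```python
import pandas as pd

# Made-up frame with an index that starts at 10 instead of 0.
passengers = pd.DataFrame(
    {"Name": ["A", "B", "C", "D"], "Age": [40, 22, 58, 36], "Pclass": [1, 3, 1, 2]},
    index=[10, 11, 12, 13],
)

# loc: boolean mask for the rows, label for the column.
older_names = passengers.loc[passengers["Age"] > 35, "Name"]

# loc with label slices includes both end points.
by_label = passengers.loc[10:12, "Name":"Age"]

# iloc with positional slices excludes the stop position.
by_position = passengers.iloc[0:2, 0:2]

print(older_names.tolist(), by_label.shape, by_position.shape)
```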

Building a web page from crawled data

import pandas as pd

# data = pd.read_html('https://ridibooks.com/category/bestsellers/2200')
# data

import requests
from bs4 import BeautifulSoup

url = 'https://ridibooks.com/category/bestsellers/2200' # change this URL
response = requests.get(url)
response.encoding = 'utf-8'
html = response.text

soup = BeautifulSoup(html, 'html.parser')

bookservices = soup.select('.title_text') # change this selector
for no, book in enumerate(bookservices, 1):
    print(no, book.text.strip())

1 구글 엔지니어는 이렇게 일한다
2 면접을 위한 CS 전공지식 노트
3 도메인 주도 개발 시작하기
4 개정판｜헤드 퍼스트 디자인 패턴
5 프로그래머의 뇌
6 개정판｜혼자 공부하는 파이썬
7 똑똑한 코드 작성을 위한 실전 알고리즘
8 프로그래머가 알아야 할 알고리즘 40
9 전길남, 연결의 탄생
10 비전공자를 위한 이해할 수 있는 IT 지식
11 적정 소프트웨어 아키텍처
12 이것이 취업을 위한 코딩 테스트다 with 파이썬
13 소문난 명강의_소플의 처음 만난 리액트
14 쉽고 빠른 플러터 앱 개발
15 동시성 프로그래밍
16 유연한 소프트웨어를 만드는 설계 원칙
17 빅데이터 시대, 성과를 이끌어 내는 데이터 문해력
18 Do it! 쉽게 배우는 파이썬 데이터 분석
19 객체지향의 사실과 오해
20 혼자 공부하는 머신러닝+딥러닝

import requests
from bs4 import BeautifulSoup

url = 'https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=1&ie=utf8&query=%EB%B0%95%EC%8A%A4%EC%98%A4%ED%94%BC%EC%8A%A4' # change this URL
response = requests.get(url)
response.encoding = 'utf-8'
html = response.text

soup = BeautifulSoup(html, 'html.parser')

bookservices = soup.select('.name') # change this selector
for no, book in enumerate(bookservices, 1):
    print(no, book.text.strip())

1 범죄도시2
2 쥬라기 월드: 도미니언
3 극장판 포켓몬스터DP: 기라...
4 그대가 조국
5 닥터 스트레인지: 대혼돈의 ...
6 카시오페아
7 애프터 양
8 특수요원 빼꼼
9 뜨거운 피: 디 오리지널
10 아치의 노래, 정태춘
11 우연과 상상
12 피는 물보다 진하다
13 킹메이커
14 나를 만나는 길
15 오마주
16 초록물고기
17 올리 마키의 가장 행복한 날
18 더 노비스
19 매스
20 몬스터 싱어: 매직 인 파리
21 봉명주공
22 극장판 주술회전 0
23 배드 가이즈
24 파리, 13구
25 극장판 엉덩이 탐정: 수플레...
26 괴물, 유령, 자유인
27 광대: 소리꾼
28 플레이그라운드
29 안녕하세요
30 리골레토
31 그대가 조국
32 아치의 노래, 정태춘
33 범죄도시2
34 극장판 주술회전 0
35 광대: 소리꾼
36 카시오페아
37 우연과 상상
38 매스
39 안녕하세요
40 배드 가이즈
41 킹메이커
42 오마주
43 애프터 양
44 닥터 스트레인지: 대혼돈의 ...
45 플레이그라운드
46 나를 만나는 길
47 극장판 포켓몬스터DP: 기라...
48 파리, 13구
49 더 노비스
50 쥬라기 월드: 도미니언
51 극장판 엉덩이 탐정: 수플레...
52 리쓰남
53 리쓰남
54 비됴알바
55 비됴알바
56 청우
57 청우
58 그루터기그루터기그루터기그루터...
59 그루터기그루터기그루터기그루터...
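The output above repeats several titles, most likely because the bare `.name` selector matches elements in more than one section of the page. The sketch below shows two possible remedies on inline HTML. The markup and the `ranking`/`reviews` class names are invented for illustration and do not reflect the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML; the real page structure and class names may differ.
html = """
<div class="ranking"><span class="name">Movie A</span><span class="name">Movie B</span></div>
<div class="reviews"><span class="name">Movie B</span><span class="name">Movie A</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# A bare class selector matches every section of the page.
all_names = [tag.text.strip() for tag in soup.select(".name")]

# Scoping the selector to one container keeps only the ranking list.
ranking = [tag.text.strip() for tag in soup.select(".ranking .name")]

# dict.fromkeys removes duplicates while keeping first-seen order.
unique_names = list(dict.fromkeys(all_names))

print(all_names, ranking, unique_names)
```

Scoping the selector is usually the safer choice, since de-duplication would also drop two different items that happen to share a name.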

import requests
from bs4 import BeautifulSoup

url = 'https://ridibooks.com/category/bestsellers/2200' # change this URL
response = requests.get(url)
response.encoding = 'utf-8'
html = response.text

soup = BeautifulSoup(html, 'html.parser')

bookservices = soup.select('.thumbnail') # change this selector
for no, book in enumerate(bookservices, 1):
    print(no, book['alt'], 'https:' + book['data-src'])

1 구글 엔지니어는 이렇게 일한다 https://img.ridicdn.net/cover/443001038/large#1
2 면접을 위한 CS 전공지식 노트 https://img.ridicdn.net/cover/754034561/large#1
3 도메인 주도 개발 시작하기 https://img.ridicdn.net/cover/443001019/large#1
4 개정판｜헤드 퍼스트 디자인 패턴 https://img.ridicdn.net/cover/443001018/large#1
5 프로그래머의 뇌 https://img.ridicdn.net/cover/852001285/large#1
6 개정판｜혼자 공부하는 파이썬 https://img.ridicdn.net/cover/443001043/large#1
7 똑똑한 코드 작성을 위한 실전 알고리즘 https://img.ridicdn.net/cover/443001042/large#1
8 프로그래머가 알아야 할 알고리즘 40 https://img.ridicdn.net/cover/754034732/large#1
9 전길남, 연결의 탄생 https://img.ridicdn.net/cover/1546000957/large#1
10 비전공자를 위한 이해할 수 있는 IT 지식 https://img.ridicdn.net/cover/4489000001/large#1
11 적정 소프트웨어 아키텍처 https://img.ridicdn.net/cover/443001041/large#1
12 이것이 취업을 위한 코딩 테스트다 with 파이썬 https://img.ridicdn.net/cover/443000825/large#1
13 소문난 명강의_소플의 처음 만난 리액트 https://img.ridicdn.net/cover/443001044/large#1
14 쉽고 빠른 플러터 앱 개발 https://img.ridicdn.net/cover/3780000151/large#1
15 동시성 프로그래밍 https://img.ridicdn.net/cover/443001024/large#1
16 유연한 소프트웨어를 만드는 설계 원칙 https://img.ridicdn.net/cover/443001020/large#1
17 빅데이터 시대, 성과를 이끌어 내는 데이터 문해력 https://img.ridicdn.net/cover/3903000029/large#1
18 Do it! 쉽게 배우는 파이썬 데이터 분석 https://img.ridicdn.net/cover/754034726/large#1
19 객체지향의 사실과 오해 https://img.ridicdn.net/cover/1160000033/large#1
20 혼자 공부하는 머신러닝+딥러닝 https://img.ridicdn.net/cover/443000859/large#1

import requests
from bs4 import BeautifulSoup

url = 'https://ridibooks.com/category/bestsellers/2200' # change this URL
response = requests.get(url)
response.encoding = 'utf-8'
html = response.text

soup = BeautifulSoup(html, 'html.parser')

책순위 = []
책이름 = []
책이미지 = []

bookservices = soup.select('.thumbnail') # change this selector
for no, book in enumerate(bookservices, 1):
    책순위.append(no)
    책이름.append(book['alt'])
    책이미지.append('https:' + book['data-src'])

책이미지

['https://img.ridicdn.net/cover/443001038/large#1',
 'https://img.ridicdn.net/cover/754034561/large#1',
 'https://img.ridicdn.net/cover/443001019/large#1',
 'https://img.ridicdn.net/cover/443001018/large#1',
 'https://img.ridicdn.net/cover/852001285/large#1',
 'https://img.ridicdn.net/cover/443001043/large#1',
 'https://img.ridicdn.net/cover/443001042/large#1',
 'https://img.ridicdn.net/cover/754034732/large#1',
 'https://img.ridicdn.net/cover/1546000957/large#1',
 'https://img.ridicdn.net/cover/4489000001/large#1',
 'https://img.ridicdn.net/cover/443001041/large#1',
 'https://img.ridicdn.net/cover/443000825/large#1',
 'https://img.ridicdn.net/cover/443001044/large#1',
 'https://img.ridicdn.net/cover/3780000151/large#1',
 'https://img.ridicdn.net/cover/443001024/large#1',
 'https://img.ridicdn.net/cover/443001020/large#1',
 'https://img.ridicdn.net/cover/3903000029/large#1',
 'https://img.ridicdn.net/cover/754034726/large#1',
 'https://img.ridicdn.net/cover/1160000033/large#1',
 'https://img.ridicdn.net/cover/443000859/large#1']

df = pd.DataFrame({
    '책순위' : 책순위,
    '책이름' : 책이름,
    '책이미지' : 책이미지
})
df

df.to_html('index.html')

def 이미지변환(path):
    return f'<img src="{path}" width="60" >'

df.to_html('index.html', escape=False, formatters=dict(책이미지=이미지변환))
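The role of `escape=False` and `formatters` can be checked without writing a file, because `to_html` returns the HTML as a string when no file name is given. A minimal sketch with placeholder rows (the `books` frame, the `to_img_tag` helper, and the example.com URLs are all made up):

```python
import pandas as pd

# Placeholder rows; the URLs are not real cover images.
books = pd.DataFrame({"rank": [1, 2],
                      "title": ["Book A", "Book B"],
                      "cover": ["https://example.com/a.png",
                                "https://example.com/b.png"]})

def to_img_tag(path):
    """Wrap an image URL in an <img> tag."""
    return f'<img src="{path}" width="60">'

# With no file name, to_html returns the HTML as a string.
# escape=False keeps the < and > of the generated tag intact,
# and formatters applies to_img_tag to every cell of the "cover" column.
html = books.to_html(escape=False, formatters={"cover": to_img_tag}, index=False)
print(html)
```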

How to create plots in pandas?

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('train.csv')
df[['SibSp', 'Parch']].plot()

<matplotlib.axes._subplots.AxesSubplot at 0x7f6f8c096e10>

df.columns

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

df.plot.scatter(x="Age", y="Fare", alpha=0.5)

<matplotlib.axes._subplots.AxesSubplot at 0x7f6f8c000f90>

df[['Age']].plot.box()

<matplotlib.axes._subplots.AxesSubplot at 0x7f6f8bb39590>

How to create new columns derived from existing columns?

df['Family'] = 1 + df['SibSp'] + df['Parch']
df

df['고향'] = '제주'
df

How to calculate summary statistics?

df["Age"].mean() # mean

29.69911764705882

df[["Age", "Fare"]].median() # median

Age     28.0000
Fare    14.4542
dtype: float64

df[["Age", "Fare"]].describe() # summary statistics

Aggregating statistics grouped by category

df[["Sex", "Age"]].groupby("Sex").mean()

df.groupby("Sex").mean(numeric_only=True) # recent pandas versions raise an error on text columns without numeric_only=True

df.groupby("Sex")["Age"].mean()

Sex
female    27.915709
male      30.726645
Name: Age, dtype: float64
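Grouped statistics can also be computed for several columns or several statistics at once. The sketch below uses a small made-up `passengers` frame so the results are easy to verify by hand.

```python
import pandas as pd

# Made-up rows; two passengers per group.
passengers = pd.DataFrame({"Sex": ["male", "female", "male", "female"],
                           "Age": [20.0, 30.0, 40.0, 50.0],
                           "Fare": [10.0, 20.0, 30.0, 40.0]})

# Select the numeric columns first, then aggregate per group.
mean_by_sex = passengers.groupby("Sex")[["Age", "Fare"]].mean()

# agg() computes several statistics for one column in a single call.
age_stats = passengers.groupby("Sex")["Age"].agg(["mean", "min", "max"])

print(mean_by_sex)
print(age_stats)
```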

df["Pclass"].value_counts()

3    491
1    216
2    184
Name: Pclass, dtype: int64

df["Sex"].value_counts()

male      577
female    314
Name: Sex, dtype: int64

How to reshape the layout of tables?

df.sort_values(by="Age").head() # returns a sorted copy (the original is unchanged), ascending order
df.sort_values(by="Age", ascending=False).head() # descending order
df.sort_values(by=['Pclass', 'Age'], ascending=False).head()

여성 = df[df["Sex"] == "female"]
여성.head()

# sort_index sorts the rows by their index.
여성.sort_index().groupby(["Age"]).head(5)
여성.sort_index(ascending=False).groupby(["Age"]).head(5)
여성[::-1]
여성[:]

여성.pivot(index="PassengerId", columns="Pclass", values="Fare") # reshape the data from long to wide
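What `pivot` does is easier to follow on a handful of rows. In this sketch (the `fares` frame is made up), each `Pclass` value becomes its own column and cells with no data are filled with NaN. `pivot_table` is shown as the aggregating counterpart.

```python
import pandas as pd

# Made-up rows in long format: one row per passenger.
fares = pd.DataFrame({"PassengerId": [1, 2, 3, 4],
                      "Pclass": [3, 1, 3, 1],
                      "Fare": [7.25, 71.28, 7.92, 53.10]})

# pivot() needs unique index/column pairs; cells with no data become NaN.
wide = fares.pivot(index="PassengerId", columns="Pclass", values="Fare")

# pivot_table() aggregates instead, here the mean fare per class.
mean_fare = fares.pivot_table(values="Fare", index="Pclass", aggfunc="mean")

print(wide)
print(mean_fare)
```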

How to combine data from multiple tables?

data = {
    '수학' : [90, 80],
    '영어' : [70, 60]
}

data2 = {
    '언어' : [20, 70],
    '과학' : [30, 60]
}

data3 = {
    '수학' : [100, 90],
    '영어' : [85, 65]
}

data = pd.DataFrame(data)
data2 = pd.DataFrame(data2)
data3 = pd.DataFrame(data3)

data

data + data2

data + data3
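The two additions above behave differently because pandas aligns DataFrames on their row and column labels before computing. A short sketch with English column names standing in for the ones above (`a`, `b`, `c` are placeholders for `data`, `data2`, `data3`):

```python
import pandas as pd

a = pd.DataFrame({"math": [90, 80], "english": [70, 60]})
b = pd.DataFrame({"language": [20, 70], "science": [30, 60]})
c = pd.DataFrame({"math": [100, 90], "english": [85, 65]})

# Matching labels are added element by element.
same_columns = a + c

# No shared columns: the result has all four columns and every cell is NaN.
different_columns = a + b

# add() with fill_value treats a value missing on one side as 0 instead of NaN.
filled = a.add(b, fill_value=0)

print(same_columns)
print(different_columns)
print(filled)
```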

pd.concat([data, data2], axis=0)

data

data['언어'] = data2['언어']
# data['과학'] = data2['과학']
# data[['언어', '과학']] = data2[['언어', '과학']]

data

data['과학'] = data2['과학']

data

pd.concat([data, data2], axis=1)

data = {
    '수학' : [90, 80],
    '영어' : [70, 60]
}

data2 = {
    '언어' : [20, 70],
    '과학' : [30, 60]
}

data3 = {
    '수학' : [100, 90],
    '영어' : [85, 65]
}

data = pd.DataFrame(data)
data2 = pd.DataFrame(data2)
data3 = pd.DataFrame(data3)

pd.concat([data, data2], axis=1)

Join tables using a common identifier

data = {
    '이름' : ['영희', '철수', '호준'],
    '수학' : [70, 60, 90]
}

data2 = {
    '이름' : ['영희', '호준'],
    '과학' : [50, 70],
    '언어' : [90, 60]
}

data = pd.DataFrame(data)
data2 = pd.DataFrame(data2)

data

data2

merge = pd.merge(data, data2, how="left", on="이름")
merge
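The `how` argument decides which rows survive the join. The sketch below compares three settings on made-up frames (`math_scores`, `science_scores`, and the names in them are placeholders) and uses `indicator=True` to show where each row of the outer join came from.

```python
import pandas as pd

math_scores = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "math": [70, 60, 90]})
science_scores = pd.DataFrame({"name": ["Ann", "Cid", "Dee"], "science": [50, 70, 80]})

# left: keep every row of the left frame; unmatched rows get NaN.
left = pd.merge(math_scores, science_scores, how="left", on="name")

# inner: keep only the names present in both frames.
inner = pd.merge(math_scores, science_scores, how="inner", on="name")

# outer: keep every name; the _merge column records the source of each row.
outer = pd.merge(math_scores, science_scores, how="outer", on="name", indicator=True)

print(left)
print(inner)
print(outer)
```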

pandas datetime

df = pd.DataFrame({'year': [2021, 2021],
                   'month': [7, 7],
                   'day': [9, 10]})

df

data = pd.to_datetime(df)
data

0   2021-07-09
1   2021-07-10
dtype: datetime64[ns]

data.dt.year

0    2021
1    2021
dtype: int64

data.dt.month

0    7
1    7
dtype: int64

data.dt.day

0     9
1    10
dtype: int64

data.dt.weekday

0    4
1    5
dtype: int64

data.dt.day_name() # on a Series use day_name(); weekday_name was removed in newer pandas versions

0      Friday
1    Saturday
dtype: object
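The `.dt` accessor also works on dates parsed from strings and can be used to derive new values. A minimal sketch using the same two dates as above (the variable names are illustrative):

```python
import pandas as pd

# Parse ISO-formatted strings into a datetime Series.
dates = pd.Series(pd.to_datetime(["2021-07-09", "2021-07-10"]))

weekday_number = dates.dt.weekday          # Monday is 0, Sunday is 6
weekday_name = dates.dt.day_name()
is_weekend = dates.dt.weekday >= 5         # Saturday or Sunday
formatted = dates.dt.strftime("%Y/%m/%d")  # back to strings in a custom format

print(weekday_name.tolist(), is_weekend.tolist(), formatted.tolist())
```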

pd.to_datetime('now') # UTC time in the pandas version used here; newer versions may return local time

Timestamp('2022-06-05 08:47:40.106892')

How to manipulate textual data?

df = pd.read_csv('train.csv')
df.head()

df["Name"].str.lower()
df["Name"].str.split(",")
df["Name"].str.contains("Mr")
df["Name"].str.contains("Mr").value_counts()
df[df["Name"].str.contains("Mr")]

df["Sex"].replace({"male": 1, "female": 0})

0      1
1      0
2      0
3      0
4      1
      ..
886    1
887    0
888    0
889    1
890    1
Name: Sex, Length: 891, dtype: int64
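One detail worth noting about the string methods above: `str.contains("Mr")` also matches "Mrs". The sketch below, on three sample names, shows a stricter pattern and two related methods. The regular expressions are one possible choice, not the only one.

```python
import pandas as pd

names = pd.Series(["Braund, Mr. Owen Harris",
                   "Cumings, Mrs. John Bradley",
                   "Heikkinen, Miss. Laina"])

# A plain substring test matches both "Mr." and "Mrs.".
loose = names.str.contains("Mr")

# Requiring the period right after "Mr" excludes "Mrs.".
strict = names.str.contains(r"\bMr\.", regex=True)

# Extract the title between the comma and the first period.
titles = names.str.extract(r",\s*([^.]+)\.", expand=False)

# expand=True returns the split parts as separate columns.
parts = names.str.split(",", expand=True)

print(loose.tolist(), strict.tolist(), titles.tolist())
```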

	연도 (년)	추계인구(명)	출생자수(명)	사망자수(명)	자연증가수(명)	조출생률 (1000명당)	조사망률 (1000명당)	자연증가율 (1000명당)	합계출산율
0	1925	12997611	558897	359042	199855	43.0	27.6	15.4	6.590
1	1926	13052741	511667	337948	173719	39.2	25.9	13.3	NaN
2	1927	13037169	534524	353818	180706	41.0	27.1	13.9	NaN
3	1928	13105131	566142	357701	208441	43.2	27.3	15.9	NaN
4	1929	13124279	566969	414366	152603	43.2	31.6	11.6	NaN
...	...	...	...	...	...	...	...	...	...
92	2017	51446201	357771	285534	72237	7.0	5.5	1.5	1.052
93	2018	51635256	326822	298820	28002	6.4	5.8	0.6	0.977
94	2019	51709098	303054	295132	7922	5.9	5.7	0.2	0.918
95	2020	51829023	272337	304948	-32611	5.3	5.9	-0.6	0.837
96	2021	51744876	260494	317773	-57280	5.1	6.2	-1.1	0.810

	사망자수(명)
0	359042
1	337948
2	353818
3	357701
4	414366
...	...
92	285534
93	298820
94	295132
95	304948
96	317773

	Age
count	3.000000
mean	38.333333
std	18.230012
min	22.000000
25%	28.500000
50%	35.000000
75%	46.500000
max	58.000000

	PassengerId	Survived	Pclass	Name	Sex	Age	SibSp	Ticket	Fare	Cabin	Embarked
0	1	0	3	Braund, Mr. Owen Harris	male	22.0	1	A/5 21171	7.2500	NaN	S
1	2	1	1	Cumings, Mrs. John Bradley (Florence Briggs Th...	female	38.0	1	PC 17599	71.2833	C85	C
2	3	1	3	Heikkinen, Miss. Laina	female	26.0	0	STON/O2. 3101282	7.9250	NaN	S
3	4	1	1	Futrelle, Mrs. Jacques Heath (Lily May Peel)	female	35.0	1	113803	53.1000	C123	S
4	5	0	3	Allen, Mr. William Henry	male	35.0	0	373450	8.0500	NaN	S

	PassengerId	Survived	Pclass	Name	Sex	Age	SibSp	Parch	Ticket	Fare	Cabin	Embarked
1	2	1	1	Cumings, Mrs. John Bradley (Florence Briggs Th...	female	38.0	1	0	PC 17599	71.2833	C85	C
6	7	0	1	McCarthy, Mr. Timothy J	male	54.0	0	0	17463	51.8625	E46	S
11	12	1	1	Bonnell, Miss. Elizabeth	female	58.0	0	0	113783	26.5500	C103	S
13	14	0	3	Andersson, Mr. Anders Johan	male	39.0	1	5	347082	31.2750	NaN	S
15	16	1	2	Hewlett, Mrs. (Mary D Kingcome)	female	55.0	0	0	248706	16.0000	NaN	S
25	26	1	3	Asplund, Mrs. Carl Oscar (Selma Augusta Emilia...	female	38.0	1	5	347077	31.3875	NaN	S
30	31	0	1	Uruchurtu, Don. Manuel E	male	40.0	0	0	PC 17601	27.7208	NaN	C
33	34	0	2	Wheadon, Mr. Edward H	male	66.0	0	0	C.A. 24579	10.5000	NaN	S
35	36	0	1	Holverson, Mr. Alexander Oskar	male	42.0	1	0	113789	52.0000	NaN	S
40	41	0	3	Ahlin, Mrs. Johan (Johanna Persdotter Larsson)	female	40.0	1	0	7546	9.4750	NaN	S

	PassengerId	Survived	Pclass	Name	Sex	Age	SibSp	Parch	Ticket	Fare	Cabin	Embarked
886	887	0	2	Montvila, Rev. Juozas	male	27.0	0	0	211536	13.00	NaN	S
887	888	1	1	Graham, Miss. Margaret Edith	female	19.0	0	0	112053	30.00	B42	S
888	889	0	3	Johnston, Miss. Catherine Helen "Carrie"	female	NaN	1	2	W./C. 6607	23.45	NaN	S
889	890	1	1	Behr, Mr. Karl Howell	male	26.0	0	0	111369	30.00	C148	C
890	891	0	3	Dooley, Mr. Patrick	male	32.0	0	0	370376	7.75	NaN	Q

	Pclass	Name	Sex	Age
9	2	Nasser, Mrs. Nicholas (Adele Achem)	female	14.0
10	3	Sandstrom, Miss. Marguerite Rut	female	4.0
11	1	Bonnell, Miss. Elizabeth	female	58.0
12	3	Saundercock, Mr. William Henry	male	20.0
13	3	Andersson, Mr. Anders Johan	male	39.0
14	3	Vestrom, Miss. Hulda Amanda Adolfina	female	14.0
15	2	Hewlett, Mrs. (Mary D Kingcome)	female	55.0
16	3	Rice, Master. Eugene	male	2.0
17	2	Williams, Mr. Charles Eugene	male	NaN
18	3	Vander Planke, Mrs. Julius (Emelia Maria Vande...	female	31.0
19	3	Masselmani, Mrs. Fatima	female	NaN
20	2	Fynney, Mr. Joseph J	male	35.0
21	2	Beesley, Mr. Lawrence	male	34.0
22	3	McGowan, Miss. Anna "Annie"	female	15.0
23	1	Sloper, Mr. William Thompson	male	28.0
24	3	Palsson, Miss. Torborg Danira	female	8.0

	책순위	책이름	책이미지
0	1	구글 엔지니어는 이렇게 일한다	https://img.ridicdn.net/cover/443001038/large#1
1	2	면접을 위한 CS 전공지식 노트	https://img.ridicdn.net/cover/754034561/large#1
2	3	도메인 주도 개발 시작하기	https://img.ridicdn.net/cover/443001019/large#1
3	4	개정판｜헤드 퍼스트 디자인 패턴	https://img.ridicdn.net/cover/443001018/large#1
4	5	프로그래머의 뇌	https://img.ridicdn.net/cover/852001285/large#1
5	6	개정판｜혼자 공부하는 파이썬	https://img.ridicdn.net/cover/443001043/large#1
6	7	똑똑한 코드 작성을 위한 실전 알고리즘	https://img.ridicdn.net/cover/443001042/large#1
7	8	프로그래머가 알아야 할 알고리즘 40	https://img.ridicdn.net/cover/754034732/large#1
8	9	전길남, 연결의 탄생	https://img.ridicdn.net/cover/1546000957/large#1
9	10	비전공자를 위한 이해할 수 있는 IT 지식	https://img.ridicdn.net/cover/4489000001/large#1
10	11	적정 소프트웨어 아키텍처	https://img.ridicdn.net/cover/443001041/large#1
11	12	이것이 취업을 위한 코딩 테스트다 with 파이썬	https://img.ridicdn.net/cover/443000825/large#1
12	13	소문난 명강의_소플의 처음 만난 리액트	https://img.ridicdn.net/cover/443001044/large#1
13	14	쉽고 빠른 플러터 앱 개발	https://img.ridicdn.net/cover/3780000151/large#1
14	15	동시성 프로그래밍	https://img.ridicdn.net/cover/443001024/large#1
15	16	유연한 소프트웨어를 만드는 설계 원칙	https://img.ridicdn.net/cover/443001020/large#1
16	17	빅데이터 시대, 성과를 이끌어 내는 데이터 문해력	https://img.ridicdn.net/cover/3903000029/large#1
17	18	Do it! 쉽게 배우는 파이썬 데이터 분석	https://img.ridicdn.net/cover/754034726/large#1
18	19	객체지향의 사실과 오해	https://img.ridicdn.net/cover/1160000033/large#1
19	20	혼자 공부하는 머신러닝+딥러닝	https://img.ridicdn.net/cover/443000859/large#1

	Age	Fare
count	714.000000	891.000000
mean	29.699118	32.204208
std	14.526497	49.693429
min	0.420000	0.000000
25%	20.125000	7.910400
50%	28.000000	14.454200
75%	38.000000	31.000000
max	80.000000	512.329200

	PassengerId	Survived	Pclass	Age	SibSp	Parch	Fare	Family
Sex
female	431.028662	0.742038	2.159236	27.915709	0.694268	0.649682	44.479818	2.343949
male	454.147314	0.188908	2.389948	30.726645	0.429809	0.235702	25.523893	1.665511

	PassengerId	Survived	Pclass	Name	Sex	Age	Ticket	Fare	Cabin	Embarked	Family	고향
851	852	0	3	Svensson, Mr. Johan	male	74.0	347060	7.7750	NaN	S	1	제주
116	117	0	3	Connors, Mr. Patrick	male	70.5	370369	7.7500	NaN	Q	1	제주
280	281	0	3	Duane, Mr. Frank	male	65.0	336439	7.7500	NaN	Q	1	제주
483	484	1	3	Turkula, Mrs. (Hedwig)	female	63.0	4134	9.5875	NaN	S	1	제주
326	327	0	3	Nysveen, Mr. Johan Hansen	male	61.0	345364	6.2375	NaN	S	1	제주

	수학	영어	언어	과학
0	90.0	70.0	NaN	NaN
1	80.0	60.0	NaN	NaN
0	NaN	NaN	20.0	30.0
1	NaN	NaN	70.0	60.0

	수학	영어
0	190	155
1	170	125

	이름	수학
0	영희	70
1	철수	60
2	호준	90

	이름	과학	언어
0	영희	50	90
1	호준	70	60

	year	month	day
0	2021	7	9
1	2021	7	10