Posts in the 'Data Analysis/Python' category


List: Data Analysis/Python (72)

라일락 꽃이 피는 날

[Pandas] transform

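transform

Calling transform after groupby applies a statistical function while keeping the original index. It computes the aggregate within each group rather than over the whole dataset, and returns one value per original row, so the result is easy to combine with the source DataFrame.

    # df is assumed to be a DataFrame with Pclass, Sex and Age columns
    # per-row mean Age of each Pclass group
    df.groupby('Pclass')['Age'].transform('mean')

    # store the group mean as a new column
    df['Age2'] = df.groupby('Pclass')['Age'].transform('mean')
    df

    # group by two keys
    df['Age3'] = df.groupby(['Pclass', 'Sex'])['Age'].transform('mean')
    df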
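To make the difference between an aggregate and a transform concrete, here is a minimal sketch. The toy DataFrame and the column name AgeMean are made up for illustration and are not the data used in the post.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pclass / Sex / Age columns.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 2, 3],
    "Sex": ["male", "female", "male", "male", "female"],
    "Age": [40.0, 30.0, 20.0, 30.0, 10.0],
})

# agg collapses each group to a single row indexed by the group key.
per_group = df.groupby("Pclass")["Age"].agg("mean")

# transform broadcasts the group statistic back onto every original row,
# so the result lines up with df's index and can be assigned directly.
df["AgeMean"] = df.groupby("Pclass")["Age"].transform("mean")

print(per_group)
print(df)
```

Because the transformed Series shares the index of df, no merge or join step is needed to attach it.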

Data Analysis/Python 2021. 10. 7. 09:45

[Numpy] linalg

1. np.linalg.inv

Computes the inverse of a matrix. The matrix must be square, that is, every dimension must have the same length.

    import numpy as np

    x = np.random.rand(3, 3)
    np.linalg.inv(x)

Matrix product (@)

    x @ np.linalg.inv(x)
    np.matmul(x, np.linalg.inv(x))

2. np.linalg.solve

Solves a linear system of the form Ax = B.

    A = np.array([[1, 1], [2, 4]])
    B = np.array([25, 64])

    x = np.linalg.solve(A, B)  # [18.  7.]
    np.allclose(A @ x, B)      # True
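As a small sketch of how the two functions relate: solving Ax = B with np.linalg.solve gives the same answer as multiplying by the inverse, and a square matrix can still fail to invert if it is singular. The matrix named singular below is a made-up example, not from the post.

```python
import numpy as np

A = np.array([[1.0, 1.0], [2.0, 4.0]])
B = np.array([25.0, 64.0])

# Two routes to the same solution of Ax = B.
x_solve = np.linalg.solve(A, B)
x_inv = np.linalg.inv(A) @ B

# Hypothetical singular matrix: the second row is twice the first,
# so no inverse exists even though the shape is square.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False

print(x_solve, x_inv, inverted)
```

When only the solution is needed, calling solve directly avoids forming the inverse explicitly.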

Data Analysis/Python 2021. 9. 8. 03:20

[Numpy] Boolean indexing

Boolean indexing

When indexing an ndarray, pass a boolean array (or list) of the same length; only the elements at positions where the mask is True are returned.

    x = np.random.randint(1, 100, size=10)
    # [75 12 80 63 69 82 24 35 92 22]

    x[x % 2 == 0]
    # array([12, 80, 82, 24, 92, 22])

    x[x ... 50)]
    # array([75, 12, 80, 63, 69, 82, 24, 92, 22])
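Masks can also be combined element-wise with | (or), & (and) and ~ (not). The sketch below reuses the sample values printed in the post. The condition "even or greater than 50" is an assumption: it happens to reproduce the last output shown there, but the original expression may have been different.

```python
import numpy as np

# Fixed copy of the sample values shown in the post (not re-drawn randomly).
x = np.array([75, 12, 80, 63, 69, 82, 24, 35, 92, 22])

even = x % 2 == 0   # boolean mask, same shape as x
big = x > 50        # assumed second condition, for illustration only

either = x[even | big]      # element-wise OR
both = x[even & big]        # element-wise AND
neither = x[~(even | big)]  # element-wise NOT

print(either)   # [75 12 80 63 69 82 24 92 22]
print(both)     # [80 82 92]
print(neither)  # [35]
```

Each condition needs its own parentheses when written inline, for example x[(x % 2 == 0) | (x > 50)], because | and & bind more tightly than comparison operators.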

Data Analysis/Python 2021. 9. 8. 03:15

[Numpy] ravel, flatten

1. ravel

Flattens a multidimensional array to one dimension.
order='C' (row-major order) / 'F' (column-major order)

    x = np.arange(15).reshape(3, 5)

    np.ravel(x)
    # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

    np.ravel(x, order='C')
    # array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

    np.ravel(x, order='F')
    # array([ 0,  5, 10,  1,  6, 11,  2,  7, 12,  3,  8, 13,  4,  9, 14])

2. flatten

Flattens a multidimensional array to one dimension. Unlike ravel, not the original data but ..
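The practical difference between the two is whether the result shares memory with the source array. A minimal sketch, with a small made-up array: np.ravel returns a view when the layout allows it, while ndarray.flatten always returns a copy.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

r = np.ravel(x)   # view of x here, because x is contiguous
f = x.flatten()   # always an independent copy

r[0] = 100        # writes through to x
f[1] = -1         # leaves x untouched

print(x)
print(np.shares_memory(x, r))  # True
print(np.shares_memory(x, f))  # False
```

If the original must stay unchanged, flatten (or an explicit copy) is the safer choice.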

Data Analysis/Python 2021. 9. 7. 23:48

[Numpy] Trigonometric functions (sin, cos, tan)

1. sin (sine)

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(-5, 5, 100)
    sin = np.sin(x)

    plt.plot(x, sin)
    plt.title('sin(x)')
    plt.show()

2. cos (cosine)

    x = np.linspace(-5, 5, 100)
    cos = np.cos(x)

    plt.plot(x, cos)
    plt.title('cos(x)')
    plt.show()

3. tan (tangent)

    x = np.linspace(-3, 3, 100)
    tan = np.tan(x)

    plt.plot(x, tan)
    plt.ylim([-10, 10])  # clip the spikes near the asymptotes
    plt.title('tan(x)')
    plt.show()
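The y-axis limit in the tan plot is there because tan(x) grows without bound near odd multiples of π/2. The following sketch checks this numerically on the same grid, without plotting; the variable names are chosen here for illustration.

```python
import numpy as np

x = np.linspace(-3, 3, 100)

# Basic identities hold element-wise on the grid.
identity_ok = np.allclose(np.sin(x) ** 2 + np.cos(x) ** 2, 1.0)
tan_ok = np.allclose(np.tan(x), np.sin(x) / np.cos(x))

# Largest magnitude of tan on the grid: it is reached at the sample
# closest to +/- pi/2 and lies far outside the [-10, 10] plot window.
peak = np.abs(np.tan(x)).max()

print(identity_ok, tan_ok, peak)
```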

Data Analysis/Python 2021. 6. 16. 19:02

[Numpy] Creating an ndarray

1. zeros

Returns a new array of the given shape and type, filled with zeros.

    numpy.zeros(shape, dtype)

shape : shape of the returned array
dtype : data type of the returned array

    np.zeros(5)
    # array([ 0., 0., 0., 0., 0.])

    np.zeros((5,), dtype=int)
    # array([0, 0, 0, 0, 0])

    np.zeros((2, 1))
    # array([[ 0.],
    #        [ 0.]])

2. ones

Returns a new array of the given shape and type, filled with ones.

    numpy.ones(shape, dtype)

shape : shape of the returned array
dtype : data type of the returned array

    np.ones(5)
    # array([1., 1., 1., 1., 1.])

    np.ones((..
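A short sketch of how the shape and dtype arguments behave, using made-up shapes: a multidimensional shape is passed as a tuple, and the default dtype is float64 unless another type is requested.

```python
import numpy as np

z = np.zeros((2, 3))             # shape given as a tuple, default dtype
o = np.ones((2, 3), dtype=int)   # integer ones

print(z.shape, z.dtype)          # (2, 3) float64
print(o.sum())                   # 6
print(np.issubdtype(o.dtype, np.integer))  # True
```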

Data Analysis/Python 2021. 6. 16. 18:45

[Numpy] Basic functions

1. Arithmetic functions

add (addition), subtract (subtraction), multiply (multiplication), divide (division)

    x = np.array([1, 2, 3])
    y = np.array([4, 5, 6])

    np.add(x, y)       # array([5, 7, 9])
    np.subtract(x, y)  # array([-3, -3, -3])
    np.multiply(x, y)  # array([ 4, 10, 18])
    np.divide(x, y)    # array([0.25, 0.4 , 0.5 ])

2. Statistical functions

min (minimum), max (maximum), argmin (index of the minimum), argmax (index of the maximum), mean (mean), median (median), var (variance), std (standard deviation)

    x = np.array([1, 2, 3, 4, 5, ..
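A minimal sketch of the statistical functions listed above. The array is a hypothetical stand-in, since the post's own example is cut off. Note that np.var and np.std compute the population statistics by default (ddof=0); pass ddof=1 for the sample versions.

```python
import numpy as np

# Hypothetical data, chosen so the results are easy to verify by hand.
x = np.array([1, 2, 3, 4, 5])

stats = {
    "min": np.min(x),
    "max": np.max(x),
    "argmin": np.argmin(x),
    "argmax": np.argmax(x),
    "mean": np.mean(x),
    "median": np.median(x),
    "var": np.var(x),               # population variance (ddof=0)
    "std": np.std(x),               # population standard deviation
    "sample_var": np.var(x, ddof=1),
}

for name, value in stats.items():
    print(name, value)
```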

Data Analysis/Python 2021. 6. 16. 18:33

[Pandas] melt

melt

Takes the variables chosen as identifiers and restructures the remaining columns into rows, keyed by those identifiers.

    pandas.melt(frame, id_vars=[], value_vars=[])

id_vars : the identifier variables
value_vars : the columns to unpivot into rows

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html
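A small sketch of the reshaping, with a made-up wide table (the column names name, math and eng are illustrative only): each value column becomes a set of rows, with the former column name stored under variable and the cell under value.

```python
import pandas as pd

# Hypothetical wide-format table: one row per person, one column per subject.
wide = pd.DataFrame({
    "name": ["A", "B"],
    "math": [90, 80],
    "eng": [70, 60],
})

# Unpivot the subject columns into rows, keyed by name.
long = pd.melt(wide, id_vars=["name"], value_vars=["math", "eng"])

print(long)
```

The result has one row per (name, subject) pair, so a 2 x 3 table becomes 4 x 3.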

Data Analysis/Python 2021. 6. 14. 18:29
