데이터과학 삼학년
Text classification using CloudML (jupyter notebook with tf.keras)
Dan-k 2020. 6. 29. 14:14
PROJECT_ID = "project"
BUCKET_NAME = "text"
REGION = "us-central1"
!gsutil ls -al gs://$BUCKET_NAME
!gcloud config set project $PROJECT_ID
!gcloud config set compute/region $REGION
MODEL_NAME = 'text_practice_model'
VERSION_NAME = 'v1'
Train
!gcloud ai-platform jobs submit training text_practice_model_20200629 \
--job-dir gs://text/model/text_practice_model_20200629 \
--module-name trainer.task \
--package-path ./trainer \
--region us-central1 \
--python-version 3.7 \
--runtime-version 2.1 \
--stream-logs \
-- \
--model-name='RNN' \
--optimizer='Adam' \
--learning-rate=0.001 \
--embed-dim=32 \
--n-classes=2 \
--train-files=gs://text/movie_train.csv \
--pred-files=gs://text/movie_predict.csv \
--pred-sequence-files=gs://text/preprocess/movie_predict_sequence.json \
--num-epoch=5 \
--batch-size=128
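Everything after the bare `--` in the command above is passed through to `trainer.task` as user arguments. The post does not show that module, so the following is only a minimal sketch of how it might parse these flags, assuming `argparse`. The function name `parse_args`, the types, and the defaults are assumptions; only the flag names come from the command.

```python
import argparse


def parse_args(argv=None):
    """Parse the user flags forwarded to trainer.task (hypothetical sketch)."""
    parser = argparse.ArgumentParser()
    # AI Platform also forwards --job-dir to the trainer when it is set.
    parser.add_argument('--job-dir', default=None)
    parser.add_argument('--model-name', default='RNN')
    parser.add_argument('--optimizer', default='Adam')
    parser.add_argument('--learning-rate', type=float, default=0.001)
    parser.add_argument('--embed-dim', type=int, default=32)
    parser.add_argument('--n-classes', type=int, default=2)
    parser.add_argument('--train-files', default=None)
    parser.add_argument('--pred-files', default=None)
    parser.add_argument('--pred-sequence-files', default=None)
    parser.add_argument('--num-epoch', type=int, default=5)
    parser.add_argument('--batch-size', type=int, default=128)
    # parse_known_args ignores any flag this sketch does not declare.
    args, _unknown = parser.parse_known_args(argv)
    return args


if __name__ == '__main__':
    print(parse_args())
```

`argparse` turns the dashes into underscores, so `--embed-dim=32` is read back as `args.embed_dim`.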
Predict
Online Predict
Register the model version
!gcloud ai-platform models create text_practice_model \
--regions us-central1
# Create model version based on that SavedModel directory
!gcloud beta ai-platform versions create v1 \
--model text_practice_model \
--runtime-version 2.1 \
--python-version 3.7 \
--framework tensorflow \
--origin gs://text/model/text_practice_model_20200629/keras_export
Data for Online Predict
import json
import pandas as pd

# Reading a gs:// path with pandas requires the gcsfs package.
dat = 'gs://text/preprocess/movie_predict_sequence.csv'
pred_df = pd.read_csv(dat)
print(pred_df.head())

# Sample 20 rows and write them as newline-delimited JSON, one instance per line.
prediction_input = pred_df.sample(20)
with open('prediction_input.json', 'w') as json_file:
    for row in prediction_input.values.tolist():
        json.dump(row, json_file)
        json_file.write('\n')
## Online predict (local)
!gcloud ai-platform predict \
--model text_practice_model \
--version v1 \
--json-instances prediction_input.json
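`gcloud ai-platform predict --json-instances` expects one JSON instance per line, which is the layout of `prediction_input.json`. If the same instances are sent to the online prediction REST API instead, they are wrapped in a single object under an `instances` key. Here is a minimal sketch of that conversion, assuming one JSON array per line; the helper name `build_request_body` and the toy rows are hypothetical.

```python
import json


def build_request_body(lines):
    """Wrap newline-delimited JSON instances in an {"instances": [...]} body."""
    instances = [json.loads(line) for line in lines if line.strip()]
    return {'instances': instances}


# Two toy instances standing in for rows of prediction_input.json.
toy_lines = ['[12, 7, 0, 0]\n', '[3, 45, 9, 0]\n']
body = build_request_body(toy_lines)
print(json.dumps(body))
```

With a real file, the lines could come from `open('prediction_input.json')` instead of the toy list.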
Batch Predict
- The Keras model cannot read the CSV file properly.
- Convert the input to JSON format, then run the prediction.
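The conversion mentioned above can be sketched as follows. This assumes each CSV row holds one integer token sequence and that the file has a header row. The function name `csv_to_json_lines` and the sample data are hypothetical, and the author's actual preprocessing is not shown in the post.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text, has_header=True):
    """Convert CSV rows of integer token ids to newline-delimited JSON."""
    reader = csv.reader(io.StringIO(csv_text))
    if has_header:
        next(reader, None)  # skip the header row
    lines = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        lines.append(json.dumps([int(value) for value in row]))
    return ''.join(line + '\n' for line in lines)


# Made-up sample: a header plus two token sequences.
sample_csv = 'w0,w1,w2\n5,12,0\n8,3,41\n'
print(csv_to_json_lines(sample_csv), end='')
```

Each output line is then a standalone JSON array, which is what `--data-format text` reads line by line.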
!gcloud ai-platform jobs submit prediction predict_text_model_pactice_20200629 \
--model-dir 'gs://text/model/text_practice_model_20200629/keras_export' \
--runtime-version 2.1 \
--data-format text \
--region us-central1 \
--input-paths 'gs://text/preprocess/movie_predict_sequence.json' \
--output-path 'gs://text/predict/practice_output'
### Wait for the prediction job to finish
!gcloud ai-platform jobs stream-logs predict_text_model_pactice_20200629
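- Online prediction returns results even when the input dimensions do not match.
- Batch prediction requires the input dimensions to match exactly.
- A custom Keras model cannot take a CSV file at prediction time; prediction works after converting the input to JSON. Related: tf.data.TextLineDataset(file_paths)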
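Because batch prediction requires every instance to have exactly the shape the model expects, it can help to pad or truncate each sequence to a fixed length before writing the input file. Below is a minimal sketch in plain Python, assuming padding with 0 at the end and a hypothetical target length `max_len`. The post does not state which padding scheme the model was trained with, so the real values should match the training preprocessing.

```python
def pad_or_truncate(sequence, max_len, pad_value=0):
    """Return a copy of sequence with exactly max_len items."""
    trimmed = list(sequence[:max_len])
    return trimmed + [pad_value] * (max_len - len(trimmed))


# Made-up sequences: one too short, one too long.
sequences = [[4, 9], [7, 1, 3, 8, 2]]
fixed = [pad_or_truncate(seq, max_len=4) for seq in sequences]
print(fixed)
```

Note that Keras `pad_sequences` pads and truncates at the front by default, so this sketch would need adjusting if the training pipeline relied on those defaults.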
Load prediction results into BigQuery
!bq load --project_id=project \
--autodetect \
--replace=false \
--source_format=NEWLINE_DELIMITED_JSON \
dataset.text_clf_predict_result_20200626 \
gs://text/predict/practice_output/prediction.results-*
Check the results
%load_ext google.cloud.bigquery
%%bigquery result
SELECT dense[offset(0)] as negat, dense[offset(1)] as posit
FROM `project.dataset.text_clf_predict_result_20200626` LIMIT 1000
result
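The query above reads the two entries of the `dense` output as the negative and positive scores. The same interpretation can be applied directly to the newline-delimited JSON result files. Here is a minimal sketch, assuming each result line is a JSON object with a two-element `dense` array (index 0 negative, index 1 positive, as in the query). The helper name `label_from_line` and the sample lines are made up for illustration.

```python
import json


def label_from_line(line):
    """Return 'negative' or 'positive' for one prediction result line."""
    negative, positive = json.loads(line)['dense']
    return 'positive' if positive > negative else 'negative'


# Made-up result lines in the assumed {"dense": [neg, pos]} layout.
sample_lines = ['{"dense": [0.91, 0.09]}', '{"dense": [0.2, 0.8]}']
labels = [label_from_line(line) for line in sample_lines]
print(labels)
```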