Pandas is a powerful analytics tool, but it takes a long time to run on large datasets.
PySpark is a Python API for Spark's distributed computing architecture.
It lets Python users write pandas-style analytic functions while benefiting from Spark's faster computation.
Use Python Flask to set up a lightweight web server and connect to it through an API.
Set up the web server:
from flask import Flask  # import the Flask class
app = Flask(__name__)  # create the Flask application object
@app.route('/')  # register the '/' URL under the main domain, e.g. http://www.Lawrence.com/
def function():  # the function defined right after app.route runs whenever this URL is requested
    return 'Welcome to your first website!'
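Run the web server:
Call app.run() in Python,
or, in cmd, first set the FLASK_APP environment variable to the name of the .py file:
set FLASK_APP=script_name.py
and then run
flask run
This starts the web server, and its content can be viewed at the server's address.
(e.g. http://131.111.111.111:3000/)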
Free deep learning and related courses: fastai
Free real-world datasets for research use: Kaggle
Statistics:
Book: An Introduction to Statistical Learning http://faculty.marshall.usc.edu/gareth-james/
Basic data manipulation and machine learning guide:
Book: Python Data Science Handbook
Use Stack Overflow survey data to understand data analysis and build a predictive model to explore insights in the data.
In this project, we use the Stack Overflow 2020 developer survey dataset to explore the following questions:
5. What is the mean time developers take to attain their highest degree?
6. How often do most developers spend time learning?
Dataset:
The dataset used in this project comes from the Stack Overflow annual developer survey held in 2020: https://drive.google.com/file/d/1dfGerWeWkcyQ9GX9x20rdSGj7WtEpzBB/view?usp=sharing
In total it has 64461 samples and 61 different questions covering personal characteristics and developer topics.
Create a text file in a directory and write text to it:
with open('directory/text.txt', 'w+') as file:  # create (or overwrite) the file
    file.write('I am a good person')  # write text to the file
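As a quick check that the write worked, the file can be read back through the same handle. The sketch below is only illustrative: the helper name write_and_read and the temporary directory are assumptions, not part of the notes above.

```python
import os
import tempfile


def write_and_read(directory, filename, text):
    """Write text to directory/filename, then read it back via the same handle."""
    path = os.path.join(directory, filename)
    # 'w+' creates (or truncates) the file and allows reading as well as writing.
    with open(path, "w+", encoding="utf-8") as file:
        file.write(text)
        file.seek(0)  # move back to the start before reading
        return file.read()


# A temporary directory stands in for 'directory' so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp_dir:
    content = write_and_read(tmp_dir, "text.txt", "I am a good person")
    print(content)
```

Without the seek(0) call, read() would start at the end of the file and return an empty string.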
Plots and visualization:
Displaying Chinese text in matplotlib:
3. Add the following to your Python code:
plt.rcParams['font.sans-serif'] = ['Microsoft JhengHei']
plt.rcParams['axes.unicode_minus'] = False
Label formats for object detection:
Object detection model (Faster R-CNN):
Pascal VOC format:
<object>
  <name>missing</name>
  <pose>Unspecified</pose>
  <truncated>1</truncated>
  <difficult>0</difficult>
  <bndbox>
    <xmin>1459</xmin>
    <ymin>2</ymin>
    <xmax>1900</xmax>
    <ymax>46</ymax>
  </bndbox>
</object>
YOLO: [class_id, object_x, object_y, object_width, object_height]
1 0.876842 0.023370 0.246316 0.044565
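The two formats are related by a simple conversion: Pascal VOC stores absolute pixel corners, while YOLO stores the box center and size as fractions of the image width and height. A minimal sketch follows, assuming that standard YOLO convention. The function names and the image size (img_w, img_h) are hypothetical, because the notes do not state the image dimensions, so the printed numbers are not expected to match the YOLO line above.

```python
import xml.etree.ElementTree as ET

# Pascal VOC fragment; the box coordinates are copied from the example above.
VOC_XML = """
<object>
  <name>missing</name>
  <bndbox>
    <xmin>1459</xmin>
    <ymin>2</ymin>
    <xmax>1900</xmax>
    <ymax>46</ymax>
  </bndbox>
</object>
"""


def parse_voc_object(xml_text):
    """Return (class name, [xmin, ymin, xmax, ymax]) from one <object> element."""
    obj = ET.fromstring(xml_text)
    box = obj.find("bndbox")
    coords = [int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
    return obj.find("name").text, coords


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners to normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


name, coords = parse_voc_object(VOC_XML)
# img_w and img_h are assumed values, used only for illustration.
yolo_box = voc_box_to_yolo(*coords, img_w=2000, img_h=1000)
print(name, yolo_box)
```

A real YOLO label line would start with an integer class_id instead of the class name, so a name-to-id mapping is also needed.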
In Anaconda, we create virtual environments to keep separate dependencies for different projects.
Create an environment:
conda create --name myenv
Activate an environment:
conda activate myenv
Deactivate the environment:
conda deactivate
Show all environments:
conda env list
Delete an environment:
conda env remove --name myenv
Batch normalization:
Feature scaling in hidden layers: during the forward pass, a change in one layer may influence the next layer. …
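To make the idea concrete, here is a minimal pure-Python sketch of the normalization step. It assumes the standard formulation: each feature is normalized with its mean and variance across the batch, then scaled and shifted. The names batch_norm, gamma, beta and eps are illustrative and not taken from the notes. In a real network gamma and beta are learned per feature; they are fixed scalars here for brevity.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) across the batch (rows)."""
    n = len(batch)
    n_features = len(batch[0])
    # Per-feature mean and (biased) variance computed over the batch.
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    # Zero-mean, unit-variance output, then scale by gamma and shift by beta.
    return [
        [
            gamma * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta
            for j in range(n_features)
        ]
        for row in batch
    ]


# Two samples whose features sit on very different scales.
activations = [[1.0, 10.0], [3.0, 30.0]]
normalized = batch_norm(activations)
print(normalized)
```

After normalization both features land on roughly the same scale, so the next layer sees inputs with a stable distribution even when the previous layer's outputs shift.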
Windows command line:
Add a path to the PATH environment variable:
For the current session:
set PATH=directory;%PATH%
Permanently:
setx PATH "directory;%PATH%"
To view the system path:
echo %PATH%
Unzip a file:
gunzip file_name
Copy a file to a directory:
cp file_name target_directory
Show a detailed listing of all files in the current path:
ll
Create a new folder in a directory (also available on Windows):
mkdir folder_name
Communicating with a remote server:
via PuTTY
Connection: enter the IP address and connect -> enter the user account and password
Upload files: