Transfer files to bucket and bq shell commands to create, load, query tables

Please find below steps to transfer files to bucket and creating, loading, querying tables in bq shell command:

1. Download and Install google cloud SDK for windows from link: https://cloud.google.com/sdk/docs/quickstart-windows

2. Once the installation is complete, a link will be opened automatically in the browser.
By providing Y as option in suggested prompt. Screen-shot will be as below:



3. Authenticate by providing the respective login details of GCP.
4. Google Cloud SDK Shell will be opened automatically or open it manually.
5. Go to Cloud storage dashboard and select -->resources-->cloud storage-->Create Bucket(if not created
6. Come to command line and
Please use below command to copy file from your local system to bucket in Google cloud.
gsutil cp [Local_file_path] [google_bucket]
Example: gsutil cp C:/Users/Admin/Downloads/2009.csv gs://gspanntst

7. Go big query shell using below command in sdk
bq shell
8. Run below command to create a sample table
bq mk -t new_dataset.newtable name:integer,value:string
Example:
mk -t gspann_tst.test2 Year:string,Month:string,DayofMonth:string,DayOfWeek:string,DepTime:string,CRSDepTime:string,ArrTime:string,CRSArrTime:string,UniqueCarrier:string,FlightNum:string,TailNum:string,ActualElapsedTime:string,CRSElapsedTime:string,AirTime:string,ArrDelay:string,DepDelay:string,Origin:string,Dest:string,Distance:string,TaxiIn:string,TaxiOut:string,Cancelled:string,CancellationCode:string,Diverted:string,CarrierDelay:string,WeatherDelay:string,NASDelay:string,SecurityDelay:string,LateAircraftDelay:string
Load the data:
load <destination_table> <source> [<schema>]

9. Example:
  load virtual-firefly-180207:gspann_tst.test2 gs://gspanntst/2008/2008.csv



To upload 660 mb file its took 89 secods in bigquery table

10. For small table load:
Example:
 gsutil cp E:/2009.csv gs://gspanntst

  
11. Go to bq shell and create a sample table:
bq shell

mk -t gspann_tst.test_small Year:string,Month:string,DayofMonth:string,DayOfWeek:string,DepTime:string,CRSDepTime:string,ArrTime:string,CRSArrTime:string,UniqueCarrier:string,FlightNum:string,TailNum:string,ActualElapsedTime:string,CRSElapsedTime:string,AirTime:string,ArrDelay:string,DepDelay:string,Origin:string,Dest:string,Distance:string,TaxiIn:string,TaxiOut:string,Cancelled:string,CancellationCode:string,Diverted:string,CarrierDelay:string,WeatherDelay:string,NASDelay:string,SecurityDelay:string,LateAircraftDelay:string



12. Load the data:
load <destination_table> <source> [<schema>]

Example:
 load virtual-firefly-180207:gspann_tst.test_small gs://gspanntst/E:/2009.csv

We can see 220 kb file data has been loaded within no time

13. Select command in bq shell:

query ‘select count(*) from gspann_tst.test2’


With where clause:
14. SELECT * FROM [virtual-firefly-180207:gspann_tst.test_small]  where FlightNum='2195'

No comments: