📔
intro-to-python
  • An Introduction to Programming in Python (for Business Students)
  • exercises
    • Data Flow Diagramming Exercise
    • Developer Collaboration Exercise
    • README
    • "Web App" Exercise
      • checkpoints
        • Checkpoint 5: Bootstrap Layout
        • Checkpoint 4: Submitting Data from Web Forms
        • Checkpoint 3: Rendering HTML Pages
        • Checkpoint 1: Routing
        • Checkpoint 2: Modular Organization
      • "Web App" Exercise - Further Exploration
    • hello-world
      • "Hello World (Local)" Exercise
      • "Hello World (Local w/ Version Control)" Exercise
      • "Hello World (Colab)" Exercise
    • "Interface Capabilities" Exercise
    • "Continuous Integration 1, 2, 3" Exercise
    • "Web Service" Exercise
      • "Web Service" Exercise - Further Exploration
    • "Testing 1, 2, 3" Exercise
    • "Command-line Computing" Exercise
      • "Command-line Computing" Exercise
      • Professor Rossetti's Mac Terminal Configuration
      • Command-line Computing Exercise
    • "Codebase Cleanup" Assignment
    • "List Comprehensions" Exercise
    • "Groceries" Exercise
      • Python Datatypes (a.k.a. "Groceries") Exercise
      • Python Datatypes (a.k.a. "Groceries") Exercise
    • "Rock, Paper, Scissors" Exercise
      • "Rock, Paper, Scissors" Exercise
    • README
    • "Monthly Sales Predictions" Exercise
    • Setting up your Local Development Environment
    • "Chart Gallery" Exercise
    • "Run the App" Exercise
    • "Web Requests" Exercise
    • "API Client" Exercise
    • "Custom Functions" Exercise
    • Process Diagramming Exercise
  • notes
    • python
      • packages
        • The bigquery Package
        • The PySimpleGUI Package
        • The dotenv Package
        • The matplotlib Package
        • The requests Package
        • The altair Package
        • The gspread Package
        • The PyMySQL Package
        • The psycopg2 Package
        • The selenium Package
        • The seaborn Package
        • The pytest Package
        • The SpeechRecognition Package
        • The flask Package
        • The pandas Package
        • The spotipy Package
        • The pipenv Package
        • The nltk Package
        • The sqlalchemy Package
        • The pymongo Package
        • The plotly Package
        • The BeautifulSoup Package
        • The sendgrid Package
        • The fpdf Package
        • The autopep8 Package
        • The tweepy Package
        • The twilio Package
        • The tkinter Package
      • Python Datatypes Overview
        • Numbers
        • Classes
        • Dates and Times
        • Strings
        • None
        • Dictionaries
        • Booleans
        • Lists
        • Class Inheritance
      • Control Flow
      • Python Modules
        • The webbrowser Module
        • The time Module
        • The csv Module
        • The sqlite3 Module
        • The itertools Module
        • The json Module
        • The math Module
        • The os Module
        • The statistics Module
        • The random Module
        • The pprint Module
        • The datetime Module
        • The collections Module
      • Printing and Logging
      • Comments
      • Syntax and Style
      • Functions
      • Variables
      • Errors
      • Docstrings
      • File Management
      • User Inputs
      • Debugging
    • clis
      • The git Utility
      • Heroku, and the heroku Utility
      • Anaconda
      • The chromedriver Utility
      • The brew Utility (Mac OS)
      • The pdftotext Utility
      • The python Utility
      • The pip Utility
    • Software
      • Software Licensing
      • Software Documentation
      • Software Ethics
      • Software Testing Overview
      • Application Programming Interfaces (APIs)
      • Software Version Control
      • Software Refactoring Overview
    • devtools
      • The VS Code Text Editor
      • Code Climate
      • Travis CI
      • GitHub Desktop Software
      • Git Bash
      • Google Colab
    • Information Systems
      • Computer Networks
      • Processes
      • Datastores
      • Information Security and Privacy
      • People
    • Technology Project Management
      • Project Management Tools and Techniques
      • The Systems Development Lifecycle (SDLC)
    • hardware
      • Servers
    • Environment Variables
  • projects
    • "Executive Dashboard" Project
      • testing
      • "Exec Dash" Further Exploration Challenges
    • The Self-Directed (a.k.a "Freestyle") Project
      • "Freestyle" Project - Demonstration
      • "Freestyle" Project - Implementation (TECH 2335 Version)
      • "Freestyle" Project - Implementation
      • "Freestyle" Project Proposal
      • plan
    • "Robo Advisor" Project
      • Robo Advisor Project - Automated Testing Challenges
      • "Robo Advisor" Further Exploration Challenges
    • "Shopping Cart" Project
      • "Shopping Cart" Project - Automated Testing Challenges
      • "Shopping Cart" Further Exploration Challenges
      • "Shopping Cart" Project Checkpoints
  • License
  • Exam Prep
  • units
    • Unit 4B: User Interfaces and Experiences (Bonus Material)
    • Unit 5b: Databases and Datastores
    • Module 1 Review
    • Unit 7b: Processing Data from the Internet (Bonus Material)
    • Unit 9: Software Products and Services
    • Unit 8: Software Maintenance and Quality Control
    • Unit 7: Processing Data from the Internet
    • Unit 6: Data Visualization
    • Unit 5: Processing CSV Data
    • Unit 4: User Interfaces and Experiences
    • Unit 3: Python Datatypes
    • Unit 12: Project Presentations
    • Unit 2: Python Language Overview
    • Unit 11: Project Implementation Sprint
    • Unit 1: The Python Development Environment
    • Unit 10: Software Planning, Analysis, and Design
    • Unit 0: Onboarding
    • Unit 5B: Advanced Data Analytics
  • Contributor's Guide
Powered by GitBook
On this page
  • Installation
  • Usage

Was this helpful?

  1. notes
  2. python
  3. packages

The bigquery Package

PreviouspackagesNextThe PySimpleGUI Package

Last updated 4 years ago

Was this helpful?

For interfacing with Google BigQuery databases.

Serverless, highly scalable, and cost-effective cloud data warehouse designed for business agility. -

References:

Installation

pip install google-cloud-bigquery

Usage

To setup this example, gain access to a Google BigQuery database, download the corresponding "google-credentials.json" file into your local repository, ignore it from version control, then set the path to that file as the GOOGLE_APPLICATION_CREDENTIALS environment variable. Also set the BIGQUERY_PROJECT_NAME and BIGQUERY_DATASET_NAME environment variables to reference your project.

import os
from datetime import datetime

from google.cloud import bigquery
from dotenv import load_dotenv

load_dotenv()

GOOGLE_APPLICATION_CREDENTIALS = os.getenv("GOOGLE_APPLICATION_CREDENTIALS") # implicit check by google.cloud
PROJECT_NAME = os.getenv("BIGQUERY_PROJECT_NAME", default="my-project")
DATASET_NAME = os.getenv("BIGQUERY_DATASET_NAME", default="my_database")

# CONNECT TO DATABASE

client = bigquery.Client()

# CREATING TABLES

dataset_address = f"{PROJECT_NAME}.{DATASET_NAME}"
table_address = f"{dataset_address}.my_table_123"

sql = f"""
    CREATE TABLE IF NOT EXISTS `{table_address}` (
        user_id STRING,
        screen_name STRING,
        friend_count INT64,
        friend_names ARRAY<STRING>
    );
"""
client.query(sql)

# INSERTS

my_table = client.get_table(table_address) # an API call (caches results for subsequent inserts)
rows_to_insert = [
    ["id1", "screen_name1", 2, ["friend1", "friend2"]],
    ["id2", "screen_name2", 2, ["friend3", "friend4"]],
]
errors = client.insert_rows(my_table, rows_to_insert)
print(errors)

# QUERYING / FETCHING RESULTS

sql = f"SELECT * FROM `{table_address}`;"
job = client.query(sql)
results = list(job.result())
for row in results:
    print(row)
    print("---")

# QUERYING / FETCHING RESULTS IN BATCHES

sql = f"SELECT * FROM `{table_address}`"

job_name = datetime.now().strftime('%Y_%m_%d_%H_%M_%S') # unique for each job
job_destination_address = f"{dataset_address}.job_{job_name}"
job_config = bigquery.QueryJobConfig(
    priority=bigquery.QueryPriority.BATCH,
    allow_large_results=True,
    destination=job_destination_address
)

job = client.query(sql, job_config=job_config)
print("JOB (FETCH USER FRIENDS):", type(job), job.job_id, job.state, job.location)
for row in job:
    print(row)
BigQuery website
https://cloud.google.com/bigquery/docs/reference/standard-sql/operators
https://cloud.google.com/bigquery/docs/reference/standard-sql/conversion_rules
https://cloud.google.com/dataprep/docs/html/DATEDIF-Function_57344707
https://towardsdatascience.com/google-bigquery-sql-dates-and-times-cheat-sheet-805b5502c7f0
https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions
https://cloud.google.com/bigquery/docs/running-queries#batch
https://cloud.google.com/bigquery/docs/paging-results