English [en], .pdf, 26.7MB, Book (non-fiction), lgrsnf/DuckDB_ Up and Running - Wei-Meng Lee.pdf
DuckDB: Up and Running: Fast Data Analytics and Reporting
O'Reilly Media, Incorporated, 1, 2025
Wei-Meng Lee
description
DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: It's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment. This handy guide shows you how to get started with this versatile and powerful tool.
Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.
• Understand the purpose of DuckDB and its main functions
• Conduct data analytics tasks using DuckDB
• Integrate DuckDB with pandas, Polars, and JupySQL
• Use DuckDB to query your data
• Perform spatial analytics using DuckDB's spatial extension
• Work with a diverse range of data including Parquet, CSV, and JSON
Alternative filename
lgli/DuckDB_ Up and Running - Wei-Meng Lee.pdf
Alternative edition
United States, United States of America
metadata comments
Publisher's PDF
Alternative description
Cover
Copyright
Table of Contents
Preface
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgements
Chapter 1. Getting Started with DuckDB
Introduction to DuckDB
Why Use DuckDB?
High-Performance Analytical Queries
Versatile Integration and Ease of Use Across Multiple Programming Languages
Open Source
A Quick Look at DuckDB
Loading Data into DuckDB
Inserting a Record
Querying a Table
Performing Aggregation
Joining Tables
Reading Data from pandas
Why DuckDB Is More Efficient
Execution Speed
Memory Usage
Summary
Chapter 2. Importing Data into DuckDB
Creating DuckDB Databases
Loading Data from Different Data Sources and Formats
Working with CSV Files
Working with Parquet Files
Working with Excel Files
Working with MySQL
Summary
Chapter 3. A Primer on SQL
Using the DuckDB CLI
Importing Data into DuckDB
Dot Commands
Persisting the In-Memory Database on Disk
DuckDB SQL Primer
Creating a Database
Creating Tables
Viewing the Schemas of Tables
Dropping a Table
Working with Tables
Populating Tables with Rows
Updating Rows
Deleting Rows
Querying Tables
Joining Tables
Aggregating Data
Analytics
Summary
Chapter 4. Using DuckDB with Polars
Introduction to Polars
Creating a Polars DataFrame
Understanding Lazy Evaluation in Polars
Querying Polars DataFrames Using DuckDB
Using the sql() Function
Using the DuckDBPyRelation Object
Summary
Chapter 5. Performing EDA with DuckDB
Our Dataset: The 2015 Flight Delays Dataset
Geospatial Analysis
Displaying a Map
Displaying All Airports on the Map
Using the spatial Extension in DuckDB
Performing Descriptive Analytics
Finding the Airports for Each State and City
Aggregating the Total Number of Airports in Each State
Obtaining the Flight Counts for Each Pair of Origin and Destination Airports
Getting the Canceled Flights from Airlines
Getting the Flight Count for Each Day of the Week
Finding the Most Common Timeslot for Flight Delays
Finding the Airlines with the Most and Fewest Delays
Summary
Chapter 6. Using DuckDB with JSON Files
Primer on JSON
Object
String
Boolean
Number
Nested Object
Array
null
Loading JSON Files into DuckDB
Using the read_json_auto() Function
Using the read_json() Function
Using the COPY-FROM Statement
Exporting Tables to JSON
Summary
Chapter 7. Using DuckDB with JupySQL
What Is JupySQL?
Installing JupySQL
Loading the sql Extension
Integrating with DuckDB
Performing Queries
Storing Snippets
Visualization
Histograms
Box Plots
Pie Charts
Bar Plots
Integrating with MySQL
Using Environment Variables
Using an .ini File
Using keyring
Summary
Chapter 8. Accessing Remote Data Using DuckDB
DuckDB’s httpfs Extension
Querying CSV and Parquet Files Remotely
Accessing CSV Files
Accessing Parquet Files
Querying Hugging Face Datasets
Using Hugging Face Datasets
Reading the Dataset Using hf:// Paths
Accessing Files Within a Folder
Querying Multiple Files Using the Glob Syntax
Working with Private Hugging Face Datasets
Summary
Chapter 9. Using DuckDB in the Cloud with MotherDuck
Introduction to MotherDuck
Signing Up for MotherDuck
MotherDuck Plans
Getting Started with MotherDuck
Adding Tables
Creating Schemas
Sharing Databases
Creating a Database
Detaching a Database
Using the Databases in MotherDuck
Querying Your Database
Writing SQL Using AI
Using MotherDuck Through the DuckDB CLI
Connecting to MotherDuck
Querying Databases on MotherDuck
Creating Databases on MotherDuck
Performing Hybrid Queries
Summary
Index
About the Author
Colophon
date open sourced
2024-12-17