How To Load Data From Postgres to Databricks Lakehouse in Minutes Using Airbyte
Author: Airbyte
Uploaded: 2022-09-01
Views: 5168
Description:
In this video, we set up a connector to send data from a Postgres server into the Databricks Lakehouse using Airbyte! From there, you can run queries in Databricks to build simple BI dashboards, run extensive analytics, or even apply Python or machine learning logic.
Docs for S3 Bucket and Databricks cluster permissions: https://docs.databricks.com/administr...
Blog Post on this tutorial: https://airbyte.com/tutorials/load-da...
Setting up a Postgres server with Docker:
docker run --rm --name airbyte-source -e POSTGRES_PASSWORD=password -p 2000:5432 -d postgres
Adding tables and rows:
docker exec -it airbyte-source psql -U postgres -c "CREATE TABLE users(id SERIAL PRIMARY KEY, col1 VARCHAR(200));"
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.users(col1) VALUES('record1');"
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.users(col1) VALUES('record2');"
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.users(col1) VALUES('record3');"
docker exec -it airbyte-source psql -U postgres -c "CREATE TABLE cities(city_code VARCHAR(8), city VARCHAR(200));"
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.cities(city_code, city) VALUES('BCN', 'Barcelona');"
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.cities(city_code, city) VALUES('MAD', 'Madrid');"
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.cities(city_code, city) VALUES('VAL', 'Valencia');"
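Before configuring the Airbyte source, it can help to confirm that the seed rows are actually in place. The following is a minimal sketch in Python, assuming the container name (airbyte-source) and user (postgres) from the commands above. The helper names psql_command and show_tables are hypothetical and are not part of the tutorial.

```python
import subprocess

CONTAINER = "airbyte-source"  # container name from the docker run command above


def psql_command(sql, container=CONTAINER, user="postgres"):
    """Build the argv for running one SQL statement via psql inside the container."""
    # The -it flags are omitted: no TTY is needed when the command is scripted,
    # and docker exec -t fails when stdin is not a terminal.
    return ["docker", "exec", container, "psql", "-U", user, "-c", sql]


def show_tables(tables=("users", "cities")):
    """Print the contents of the seeded tables (the container must be running)."""
    for table in tables:
        cmd = psql_command(f"SELECT * FROM public.{table};")
        # check=True raises CalledProcessError if psql or docker reports an error.
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        print(result.stdout)
```

Calling show_tables() while the container is running should print the three rows of each table. If it raises instead, the container is probably not up or a CREATE TABLE step was skipped.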
Listing directories in Databricks Notebook:
dbutils.fs.ls("s3a://jchau31/data_sync/test/public")
Reading and Loading data:
df = spark.read.load("s3a://{name_of_bucket}/data_sync/test/public/cities")
display(df)
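To avoid retyping the bucket path for each table, the read step can be wrapped in a small helper. This is only a sketch for a Databricks notebook: it assumes the users table is synced under the same data_sync/test/public layout as cities (the description only shows cities), and table_path and load_tables are hypothetical names introduced here.

```python
def table_path(bucket, table, prefix="data_sync/test", schema="public"):
    """Return the s3a:// location of one synced table."""
    return f"s3a://{bucket}/{prefix}/{schema}/{table}"


def load_tables(spark, bucket, tables=("users", "cities")):
    """Load each synced table into a DataFrame, keyed by table name."""
    # spark is the SparkSession that Databricks notebooks provide by default.
    return {name: spark.read.load(table_path(bucket, name)) for name in tables}
```

In a notebook cell, frames = load_tables(spark, "name_of_bucket") followed by display(frames["users"]) would show the synced users rows, with name_of_bucket replaced by your own S3 bucket name.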
Subscribe to our newsletter: https://airbyte.com/newsletter?utm_so...
Learn more about Airbyte: https://airbyte.com