Skip to main content
Version: 1.2.0 (latest)

Facebook ads

Need help deploying these sources or figuring out how to run them in your data stack?
Join our Slack community or Book a call with a dltHub Solutions Engineer.

Facebook Ads is the advertising platform that lets businesses and individuals create targeted ads on Facebook and its affiliated apps like Instagram and Messenger.

This Facebook dlt verified source and pipeline example loads data using the Facebook Marketing API to the destination of your choice.

The endpoints that this verified source supports are:

NameDescription
campaignsA structured marketing initiative that focuses on a specific objective or goal
ad_setsA subset or group of ads within a campaign
adsAn individual advertisement that is created and displayed within an ad set
creativesVisual and textual elements that make up an advertisement
ad_leadsInformation collected from users who have interacted with lead generation ads
facebook_insightsData on audience demographics, post reach, and engagement metrics

To get a complete list of sub-endpoints that can be loaded, see facebook_ads/settings.py.

Setup guide

Grab credentials

Grab Account ID

  1. Ensure that you have Ads Manager active for your Facebook account.
  2. Find your account ID, which is a long number. You can locate it by clicking on the Account Overview dropdown in Ads Manager or by checking the link address. For example, https://adsmanager.facebook.com/adsmanager/manage/accounts?act=10150974068878324.
  3. Note this account ID as it will further be used in configuring dlt.

Grab Access_Token

  1. Sign up for a developer account on developers.facebook.com.
  2. Log in to your developer account and click on "My Apps" in the top right corner.
  3. Create an app, select "Other" as the type, choose "Business" as the category, and click "Next".
  4. Enter the name of your app and select the associated business manager account.
  5. Go to the "Basic" settings in the left-hand side menu.
  6. Copy the "App ID" and "App secret" and paste them as "client_id" and "client_secret" in the secrets.toml file in the .dlt folder.
  7. Next, obtain a short-lived access token at https://developers.facebook.com/tools/explorer/.
  8. Select the created app, add "ads_read" and "lead_retrieval" permissions, and generate a short-lived access token.
  9. Copy the access token and update it in the .dlt/secrets.toml file.

Exchange short-lived token for a long-lived token

By default, Facebook access tokens have a short lifespan of one hour. To exchange a short-lived Facebook access token for a long-lived token, update the .dlt/secrets.toml with client_id and client_secret, and execute the provided Python code.

from facebook_ads import get_long_lived_token
print(get_long_lived_token("your short-lived token"))

Replace the access_token in the .dlt/secrets.toml file with the long-lived token obtained from the above code snippet.

To retrieve the expiry date and the associated scopes of the token, you can use the following command:

from facebook_ads import debug_access_token
debug_access_token()

We highly recommend you add the token expiration timestamp to get notified a week before token expiration that you need to rotate it. Right now, the notifications are sent to the logger with error level. In config.toml / secrets.toml:

[sources.facebook_ads]
access_token_expires_at=1688821881

Note: The Facebook UI, which is described here, might change. The full guide is available at this link.

Initialize the verified source

To get started with your data pipeline, follow these steps:

  1. Enter the following command:

    dlt init facebook_ads duckdb

    This command will initialize the pipeline example with Facebook Ads as the source and duckdb as the destination.

  2. If you'd like to use a different destination, simply replace duckdb with the name of your preferred destination.

  3. After running this command, a new directory will be created with the necessary files and configuration settings to get started.

For more information, read the guide on how to add a verified source.

Add credential

  1. Inside the .dlt folder, you'll find a file called secrets.toml, which is where you can securely store your access tokens and other sensitive information. It's important to handle this file with care and keep it safe. Here's what the file looks like:

    # put your secret values and credentials here
    # do not share this file and do not push it to github
    [sources.facebook_ads]
    access_token="set me up!"
  2. Replace the access_token value with the previously copied one to ensure secure access to your Facebook Ads resources.

  3. Next, follow the destination documentation instructions to add credentials for your chosen destination, ensuring proper routing of your data to the final destination.

  4. It is strongly recommended to add the token expiration timestamp to your config.toml or secrets.toml file.

  5. Next, store your pipeline configuration details in the .dlt/config.toml.

    Here's what the config.toml looks like:

    [sources.facebook_ads]
    account_id = "Please set me up!"
  6. Replace the value of the "account id" with the one copied above.

For more information, read the General Usage: Credentials.

Run the pipeline

  1. Before running the pipeline, ensure that you have installed all the necessary dependencies by running the command:
    pip install -r requirements.txt
  2. You're now ready to run the pipeline! To get started, run the following command:
    python facebook_ads_pipeline.py
  3. Once the pipeline has finished running, you can verify that everything loaded correctly by using the following command:
    dlt pipeline <pipeline_name> show
    For example, the pipeline_name for the above pipeline example is facebook_ads. You may also use any custom name instead.

For more information, read the guide on how to run a pipeline.

Sources and resources

dlt works on the principle of sources and resources.

Default endpoints

You can write your own pipelines to load data to a destination using this verified source. However, it is important to note the complete list of the default endpoints given in facebook_ads/settings.py.

Source facebook_ads_source

This function returns a list of resources to load campaigns, ad sets, ads, creatives, and ad leads data from the Facebook Marketing API.

@dlt.source(name="facebook_ads")
def facebook_ads_source(
account_id: str = dlt.config.value,
access_token: str = dlt.secrets.value,
chunk_size: int = 50,
request_timeout: float = 300.0,
app_api_version: str = None,
) -> Sequence[DltResource]:
...

account_id: Account ID associated with the ad manager, configured in "config.toml".

access_token: Access token associated with the Business Facebook App, configured in "secrets.toml".

chunk_size: The size of the page and batch request. You may need to decrease it if you request a lot of fields. Defaults to 50.

request_timeout: Connection timeout. Defaults to 300.0.

app_api_version: A version of the Facebook API required by the app for which the access tokens were issued, e.g., 'v17.0'. Defaults to the facebook_business library default version.

Resource ads

The ads function fetches ad data. It retrieves ads from a specified account with specific fields and states.

@dlt.resource(primary_key="id", write_disposition="replace")
def ads(
fields: Sequence[str] = DEFAULT_AD_FIELDS,
states: Sequence[str] = None,
) -> Iterator[TDataItems]:

yield get_data_chunked(account.get_ads, fields, states, chunk_size)

fields: Retrieves fields for each ad. For example, “id”, “name”, “adset_id”, etc.

states: The possible states include "Active," "Paused," "Pending Review," "Disapproved," "Completed," and "Archived."

Resources for facebook_ads_source

Similar to resource ads, the following resources have been defined in the __init__.py for source facebook_ads_source:

ResourceDescription
campaignsFetches all DEFAULT_CAMPAIGN_FIELDS
ad_setsFetches all DEFAULT_ADSET_FIELDS
leadsFetches all DEFAULT_LEAD_FIELDS, uses @dlt.transformer decorator
ad_creativesFetches all DEFAULT_ADCREATIVE_FIELDS

The default fields are defined in facebook_ads/settings.py

Source facebook_insights_source

This function returns a list of resources to load facebook_insights.

@dlt.source(name="facebook_ads")
def facebook_insights_source(
account_id: str = dlt.config.value,
access_token: str = dlt.secrets.value,
initial_load_past_days: int = 30,
fields: Sequence[str] = DEFAULT_INSIGHT_FIELDS,
attribution_window_days_lag: int = 7,
time_increment_days: int = 1,
breakdowns: TInsightsBreakdownOptions = "ads_insights_age_and_gender",
action_breakdowns: Sequence[str] = ALL_ACTION_BREAKDOWNS,
level: TInsightsLevels = "ad",
action_attribution_windows: Sequence[str] = ALL_ACTION_ATTRIBUTION_WINDOWS,
batch_size: int = 50,
request_timeout: int = 300,
app_api_version: str = None,
) -> DltResource:
...

account_id: Account ID associated with ads manager, configured in config.toml.

access_token: Access token associated with the Business Facebook App, configured in secrets.toml.

initial_load_past_days: How many past days (starting from today) to initially load. Defaults to 30.

fields: A list of fields to include in each report. Note that the “breakdowns” option adds fields automatically. Defaults to DEFAULT_INSIGHT_FIELDS.

attribution_window_days_lag: Attribution window in days. The reports in the attribution window are refreshed on each run. Defaults to 7.

time_increment_days: The report aggregation window in days. Use 7 for weekly aggregation. Defaults to 1.

breakdowns: Presents with common aggregations. See settings.py for details. Defaults to "ads_insights_age_and_gender".

action_breakdowns: Action aggregation types. See settings.py for details. Defaults to ALL_ACTION_BREAKDOWNS.

level: The granularity level. Defaults to "ad".

action_attribution_windows: Attribution windows for actions. Defaults to ALL_ACTION_ATTRIBUTION_WINDOWS.

batch_size: Page size when reading data from a particular report. Defaults to 50.

request_timeout: Connection timeout. Defaults to 300.

app_api_version: A version of the Facebook API required by the app for which the access tokens were issued, e.g., 'v17.0'. Defaults to the facebook_business library default version.

Resource facebook_insights

This function fetches Facebook insights data incrementally from a specified start date until the current date, in day steps.

@dlt.resource(primary_key=INSIGHTS_PRIMARY_KEY, write_disposition="merge")
def facebook_insights(
date_start: dlt.sources.incremental[str] = dlt.sources.incremental(
"date_start", initial_value=initial_load_start_date_str
)
) -> Iterator[TDataItems]:
...

date_start: Parameter sets the initial value for the "date_start" parameter in dlt.sources.incremental. It is based on the last pipeline run or defaults to today's date minus the specified number of days in the "initial_load_past_days" parameter.

Customization

Create your own pipeline

If you wish to create your own pipelines, you can leverage source and resource methods from this verified source.

  1. Configure the pipeline by specifying the pipeline name, destination, and dataset as follows:

    pipeline = dlt.pipeline(
    pipeline_name="facebook_ads", # Use a custom name if desired
    destination="duckdb", # Choose the appropriate destination (e.g., duckdb, redshift, post)
    dataset_name="facebook_ads_data" # Use a custom name if desired
    )

    To read more about pipeline configuration, please refer to our documentation.

  2. To load all the data from campaigns, ad sets, ads, ad creatives, and leads:

    load_data = facebook_ads_source()
    load_info = pipeline.run(load_data)
    print(load_info)
  3. To merge the Facebook Ads with the state "DISAPPROVED" and with ads state "PAUSED", you can do the following:

    load_data = facebook_ads_source()
    # It is recommended to enable root key propagation on a source that is not a merge one by default. This is not required if you always use merge but below we start with replace
    load_data.root_key = True

    # Load only disapproved ads
    load_data.ads.bind(states=("DISAPPROVED",))
    load_info = pipeline.run(load_data.with_resources("ads"), write_disposition="replace")
    print(load_info)

    # Here we merge the paused ads but the disapproved ads stay there!
    load_data = facebook_ads_source()
    load_data.ads.bind(states=("PAUSED",))
    load_info = pipeline.run(load_data.with_resources("ads"), write_disposition="merge")
    print(load_info)

    In the above steps, we first load the "ads" data with the "DISAPPROVED" state in replace mode and then merge the ads data with the "PAUSED" state on that.

  4. To load data with a custom field, for example, to load only "id" from Facebook ads, you can do the following:

    load_data = facebook_ads_source()
    # Only loads ad ids, works the same for campaigns, leads, etc.
    load_data.ads.bind(fields=("id",))
    load_info = pipeline.run(load_data.with_resources("ads"))
    print(load_info)
  5. This pipeline includes an enrichment transformation called enrich_ad_objects that you can apply to any resource to obtain additional data per object using object.get_api. The following code demonstrates how to enrich objects by adding an enrichment transformation that includes additional fields.

    # You can reduce the chunk size for smaller requests
    load_data = facebook_ads_source(chunk_size=2)

    # Request only the "id" field for ad_creatives
    load_data.ad_creatives.bind(fields=("id",))

    # Add a transformation to the ad_creatives resource
    load_data.ad_creatives.add_step(
    # Specify the AdCreative object type and request the desired fields
    enrich_ad_objects(AdCreative, DEFAULT_ADCREATIVE_FIELDS)
    )

    # Run the pipeline with the ad_creatives resource
    load_info = pipeline.run(load_data.with_resources("ad_creatives"))

    print(load_info)

    In the above code, the "load_data" object represents the Facebook Ads source, and we specify the desired chunk size for the requests. We then bind the "id" field for the "ad_creatives" resource using the "bind()" method.

    To enrich the adcreatives objects, we add a transformation using the "add_step()" method. The "enrich_ad_objects" function is used to specify the AdCreative object type and request the fields defined in _DEFAULT_ADCREATIVE_FIELDS.

    Finally, we run the pipeline with the ad_creatives resource and store the load information in the load_info.

  6. You can also load insights reports incrementally with defined granularity levels, fields, breakdowns, etc., as defined in the facebook_insights_source. This function generates daily reports for a specified number of past days.

    load_data = facebook_insights_source(
    initial_load_past_days=30,
    attribution_window_days_lag=7,
    time_increment_days=1
    )
    load_info = pipeline.run(load_data)
    print(load_info)

By default, daily reports are generated from initial_load_past_days ago to today. On subsequent runs, only new reports are loaded, with the past attribution_window_days_lag days (default is 7) being refreshed to accommodate any changes. You can adjust time_increment_days to change report frequency (default set to one).

This demo works on codespaces. Codespaces is a development environment available for free to anyone with a Github account. You'll be asked to fork the demo repository and from there the README guides you with further steps.
The demo uses the Continue VSCode extension.

Off to codespaces!

DHelp

Ask a question

Welcome to "Codex Central", your next-gen help center, driven by OpenAI's GPT-4 model. It's more than just a forum or a FAQ hub – it's a dynamic knowledge base where coders can find AI-assisted solutions to their pressing problems. With GPT-4's powerful comprehension and predictive abilities, Codex Central provides instantaneous issue resolution, insightful debugging, and personalized guidance. Get your code running smoothly with the unparalleled support at Codex Central - coding help reimagined with AI prowess.