Bronze Layer · Data Source

Salesforce

CRM data synced via REST and Bulk API. 20 objects covering accounts, contacts, financial accounts, opportunities, contracts, cases, users, and custom Financial Services Cloud objects.

Objects: 20
Schedule: 5:00 AM UTC, daily
Target DB: PostgreSQL
Archive: None
I

Data Source

Property            Value
Source Type         CRM, REST API / Bulk API (SOQL queries)
Delivery Method     HTTPS API, direct query per object
Authentication      OAuth 2.0 via redwood-salesforce-config (Azure Key Vault)
Sync Modes          full, incremental, smart
Strategies          truncate, recreate, smart
Incremental field   SystemModstamp (configurable per object)
Bulk API            Used for large objects (configurable per object)
Staging directory   /tmp/salesforce_staging
Client Package      SalesforceObjectClient (internal)
DAG                 redwood_salesforce_sync
DAG File            redwood_salesforce_sync.py
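
The incremental mode filters each object on its tracking field. The sketch below shows one way such a SOQL query could be assembled; `build_soql` and its parameters are hypothetical names for illustration and are not taken from SalesforceObjectClient.

```python
from datetime import datetime, timezone
from typing import Optional, Sequence


def build_soql(
    sf_object: str,
    fields: Sequence[str],
    tracking_field: str = "SystemModstamp",
    last_synced: Optional[datetime] = None,
) -> str:
    """Build a SOQL query; add a watermark filter when last_synced is given.

    last_synced must be timezone-aware. With no watermark the query is a
    full extract of the object.
    """
    query = f"SELECT {', '.join(fields)} FROM {sf_object}"
    if last_synced is not None:
        # SOQL datetime literals are unquoted and expressed here in UTC.
        literal = last_synced.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        query += f" WHERE {tracking_field} > {literal}"
    return f"{query} ORDER BY {tracking_field}"
```

Calling it without `last_synced` yields the full-extract form, so the same helper could serve both the full and incremental modes.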

Each active row in salesforce_sync_config generates an Airflow task named sync__{object_name}. Tasks run with trigger_rule='all_done', so a failure in one object does not block the others. Every task retries up to 3 times with a 5-second delay between attempts. A final cleanup task removes the staging directory.
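
As a rough illustration of the per-row task generation described above, the sketch below turns config rows into the keyword arguments an Airflow operator would receive. `build_task_specs` is a hypothetical helper. This page does not say which column fills {object_name}; table_name is assumed here because two destination tables read from the Account object.

```python
from datetime import timedelta


def build_task_specs(config_rows):
    """Turn salesforce_sync_config rows into one task specification each."""
    specs = []
    for row in config_rows:
        if not row["is_active"]:
            continue  # inactive objects get no task
        specs.append(
            {
                # Assumption: the task suffix is the destination table name.
                "task_id": f"sync__{row['table_name']}",
                "retries": 3,
                "retry_delay": timedelta(seconds=5),
                # all_done: run once upstream tasks finish, whatever their state.
                "trigger_rule": "all_done",
            }
        )
    return specs
```

Each returned dictionary uses standard operator argument names (`task_id`, `retries`, `retry_delay`, `trigger_rule`), so it could be unpacked into an operator constructor inside the DAG file.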

II

Objects Being Synced

Table                                  Salesforce Object        Description
salesforce_accounts                    Account                  Firm/client account master (607 columns incl. FinServ custom fields)
salesforce_accounts_new                Account                  Alternate account table (603 columns)
salesforce_contacts                    Contact                  Individual contact records (217 columns)
salesforce_leads                       Lead                     Prospect/lead data (130 columns)
salesforce_opportunities               Opportunity              Sales pipeline opportunities (92 columns)
salesforce_cases                       Case                     Support cases (522 columns)
salesforce_events                      Event                    Calendar events (89 columns)
salesforce_tasks                       Task                     Task records (80 columns)
salesforce_contracts                   Contract                 Contract master (126 columns)
salesforce_contract_line_items         ContractLineItem         Line items on contracts (74 columns)
salesforce_financial_accounts          FinancialAccount         Financial account relationships (213 columns)
salesforce_users                       User                     Salesforce user directory (222 columns)
salesforce_record_types                RecordType               Record type mapping (18 columns)
salesforce_models                      Model__c                 Investment model definitions (27 columns)
salesforce_investor_profiles           InvestorProfile__c       Investor risk/suitability profiles (22 columns)
salesforce_investor_profile_history    InvestorProfileHistory   Profile change audit (13 columns)
salesforce_trade_tickets               TradeTicket__c           Trade ticket records (76 columns)
salesforce_account_account_relations   AccountAccountRelation   Account-to-account relationships
salesforce_account_contact_relations   AccountContactRelation   Account-to-contact relationships
salesforce_account_contract_relations  AccountContractRelation  Account-to-contract relationships

III

Config, Logs, Monitoring & Schedule

Configuration Table — salesforce_sync_config

CREATE TABLE public.salesforce_sync_config (
    sf_object_name  TEXT NOT NULL,   -- e.g. Account, Contact, FinancialAccount
    table_name      TEXT NOT NULL,   -- destination table
    sync_mode       TEXT NOT NULL,   -- full | incremental
    strategy        TEXT NOT NULL,   -- smart | truncate | recreate
    tracking_field  TEXT NOT NULL,   -- default: SystemModstamp
    use_bulk_api    BOOLEAN NOT NULL,
    is_active       BOOLEAN NOT NULL
);
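
A config row could be checked before it drives a sync, as in the sketch below. `SyncConfig` is a hypothetical class, not part of the pipeline. The allowed values are taken from the Sync Modes and Strategies rows of the Data Source table, and the SystemModstamp default follows the column comment above.

```python
from dataclasses import dataclass

# Allowed values as listed in the Data Source table on this page.
SYNC_MODES = {"full", "incremental", "smart"}
STRATEGIES = {"truncate", "recreate", "smart"}


@dataclass(frozen=True)
class SyncConfig:
    """One row of salesforce_sync_config, validated on construction."""

    sf_object_name: str
    table_name: str
    sync_mode: str
    strategy: str
    use_bulk_api: bool
    is_active: bool
    tracking_field: str = "SystemModstamp"

    def __post_init__(self):
        if self.sync_mode not in SYNC_MODES:
            raise ValueError(f"unknown sync_mode: {self.sync_mode!r}")
        if self.strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {self.strategy!r}")
```

Rejecting an unknown mode or strategy at load time keeps a mistyped config value from surfacing later as a failed task.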

Metadata / Log Table — salesforce_sync_metadata

Column             Type       Description
sf_object_name     TEXT       Object synced
table_name         TEXT       Destination table
sync_mode          TEXT       Mode used
strategy           TEXT       Strategy used
records_fetched    INTEGER    API records received
records_loaded     INTEGER    Records written to DB
sync_started_at    TIMESTAMP  Run start
sync_completed_at  TIMESTAMP  Run end
status             TEXT       success or failed
error_message      TEXT       Error detail if failed
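
To make the column meanings concrete, the sketch below assembles one log row as a dictionary keyed by the columns above. `build_metadata_row` is a hypothetical helper, and how the real pipeline writes the row is not described on this page.

```python
from datetime import datetime, timezone


def build_metadata_row(config, records_fetched, records_loaded, started_at, error=None):
    """Assemble one salesforce_sync_metadata row for a finished sync attempt."""
    return {
        "sf_object_name": config["sf_object_name"],
        "table_name": config["table_name"],
        "sync_mode": config["sync_mode"],
        "strategy": config["strategy"],
        "records_fetched": records_fetched,
        "records_loaded": records_loaded,
        "sync_started_at": started_at,
        "sync_completed_at": datetime.now(timezone.utc),
        # status is 'failed' whenever an exception was captured for the run.
        "status": "failed" if error is not None else "success",
        "error_message": str(error) if error is not None else None,
    }
```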

Monitoring & Alerting

Property                Value
Notifier                airflow_email_notifier · send_dag_notification
Subject prefix          [Redwood-Salesforce]
On success / failure    Yes / Yes
Retries per task        3 with 5-second delays
Execution timeout       3 hours
Task isolation          trigger_rule='all_done' (one object failure doesn't block others)
In-pipeline validation  Record count assertion per object · smart delta detection
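
A per-object record count assertion might look like the sketch below. The page does not state what the real check compares; this version assumes the number of records fetched from the API must equal the number loaded, and `RecordCountMismatch` and `assert_record_counts` are hypothetical names.

```python
class RecordCountMismatch(Exception):
    """Raised when fetched and loaded record counts disagree."""


def assert_record_counts(sf_object, records_fetched, records_loaded):
    """Fail the task if the load did not write every fetched record."""
    if records_fetched != records_loaded:
        raise RecordCountMismatch(
            f"{sf_object}: fetched {records_fetched}, loaded {records_loaded}"
        )
```

Raising an exception lets the task's retry policy and failure notification take over, rather than logging the mismatch and continuing.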

Email recipients: prashant.surana@collation.ai · monitor@collation.ai · mschwartz@sequoia-financial.com · handerson@sequoia-financial.com · vzivich@sequoia-financial.com · hpatel@sequoia-financial.com · karishni.mehta@collation.ai

DAG Schedule

Daily at 5:00 AM UTC (cron: 0 5 * * *)

Max active runs: default · Catchup: disabled · Start date: 2024-01-01

IV

Bronze Layer — Database Schema

Standard columns (all tables): Id TEXT · IsDeleted BOOLEAN · CreatedDate TIMESTAMP · LastModifiedDate TIMESTAMP · SystemModstamp TIMESTAMP
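
The column types in the listings below are consistent with a simple mapping from Salesforce field types to PostgreSQL types. The sketch below is an assumption about how such a mapping could be written, not the pipeline's documented behavior; `SF_TO_PG` and `column_ddl` are hypothetical names, and unknown types fall back to TEXT.

```python
# Salesforce describe field types mapped to the PostgreSQL types used on this page.
SF_TO_PG = {
    "id": "TEXT", "string": "TEXT", "reference": "TEXT", "picklist": "TEXT",
    "textarea": "TEXT", "phone": "TEXT", "email": "TEXT", "url": "TEXT",
    "boolean": "BOOLEAN",
    "int": "INTEGER",
    "double": "NUMERIC", "currency": "NUMERIC", "percent": "NUMERIC",
    "date": "DATE",
    "datetime": "TIMESTAMP",
}


def column_ddl(fields):
    """Render a list of (name, salesforce_type) pairs as aligned column definitions."""
    width = max(len(name) for name, _ in fields)
    return ",\n".join(
        f"{name.ljust(width)} {SF_TO_PG.get(sf_type, 'TEXT')}"
        for name, sf_type in fields
    )
```

For example, `column_ddl([("Id", "id"), ("IsDeleted", "boolean"), ("SystemModstamp", "datetime")])` renders three of the standard columns in the same layout as the listings that follow.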

salesforce_accounts — Account Master (607 columns)

Core standard fields plus extensive Financial Services Cloud and Tamarac namespace custom fields.

-- Standard fields
Id                               TEXT,
Name                             TEXT,
Type                             TEXT,
RecordTypeId                     TEXT,
BillingStreet                    TEXT,
BillingCity                      TEXT,
BillingState                     TEXT,
BillingPostalCode                TEXT,
BillingCountry                   TEXT,
Phone                            TEXT,
Fax                              TEXT,
Website                          TEXT,
AnnualRevenue                    NUMERIC,
NumberOfEmployees                INTEGER,
Industry                         TEXT,
OwnerId                          TEXT,
IsDeleted                        BOOLEAN,
CreatedDate                      TIMESTAMP,
LastModifiedDate                 TIMESTAMP,
SystemModstamp                   TIMESTAMP,
-- Financial Services Cloud (FinServ__) namespace
FinServ__PrimaryContact__c       TEXT,
FinServ__InvestmentObjectives__c TEXT,
FinServ__RiskTolerance__c        TEXT,
FinServ__TotalAssets__c          NUMERIC,
FinServ__AnnualIncome__c         NUMERIC,
FinServ__NetWorth__c             NUMERIC,
-- Tamarac namespace
Tamarac__HouseholdId__c          TEXT,
Tamarac__AccountNumber__c        TEXT,
Tamarac__CustodianCode__c        TEXT
-- ... plus ~570 additional custom fields

salesforce_contacts — Contacts (217 columns)

Id                     TEXT,
AccountId              TEXT,
FirstName              TEXT,
LastName               TEXT,
Email                  TEXT,
Phone                  TEXT,
MobilePhone            TEXT,
Title                  TEXT,
Department             TEXT,
MailingStreet          TEXT,
MailingCity            TEXT,
MailingState           TEXT,
MailingPostalCode      TEXT,
Birthdate              DATE,
DoNotCall              BOOLEAN,
HasOptedOutOfEmail     BOOLEAN,
OwnerId                TEXT,
IsDeleted              BOOLEAN,
CreatedDate            TIMESTAMP,
LastModifiedDate       TIMESTAMP,
SystemModstamp         TIMESTAMP
-- Plus custom FinServ__ and other namespace fields

salesforce_financial_accounts — Financial Accounts (213 columns)

Id                                TEXT,
Name                              TEXT,
FinServ__PrimaryOwner__c          TEXT,
FinServ__FinancialAccountType__c  TEXT,
FinServ__Balance__c               NUMERIC,
FinServ__OpenDate__c              DATE,
FinServ__HeldAway__c              BOOLEAN,
FinServ__SourceSystemId__c        TEXT,
FinServ__CustodianCode__c         TEXT,
Tamarac__AccountNumber__c         TEXT,
Tamarac__TaxId__c                 TEXT,
IsDeleted                         BOOLEAN,
CreatedDate                       TIMESTAMP,
SystemModstamp                    TIMESTAMP
-- Plus ~200 additional FinServ__ custom fields

salesforce_opportunities — Opportunities (92 columns)

Id             TEXT,
AccountId      TEXT,
Name           TEXT,
StageName      TEXT,
Amount         NUMERIC,
CloseDate      DATE,
Probability    NUMERIC,
Type           TEXT,
LeadSource     TEXT,
Description    TEXT,
OwnerId        TEXT,
IsClosed       BOOLEAN,
IsWon          BOOLEAN,
CreatedDate    TIMESTAMP,
SystemModstamp TIMESTAMP

salesforce_cases — Cases (522 columns)

Id             TEXT,
AccountId      TEXT,
ContactId      TEXT,
Subject        TEXT,
Description    TEXT,
Status         TEXT,
Priority       TEXT,
Type           TEXT,
Reason         TEXT,
Origin         TEXT,
IsClosed       BOOLEAN,
OwnerId        TEXT,
CreatedDate    TIMESTAMP,
SystemModstamp TIMESTAMP
-- Plus ~508 custom fields

salesforce_users — Users (222 columns)

Id               TEXT,
Username         TEXT,
FirstName        TEXT,
LastName         TEXT,
Email            TEXT,
Title            TEXT,
Department       TEXT,
UserType         TEXT,
IsActive         BOOLEAN,
ProfileId        TEXT,
UserRoleId       TEXT,
CreatedDate      TIMESTAMP,
LastModifiedDate TIMESTAMP,
SystemModstamp   TIMESTAMP

Other Objects

Table                                  Columns  Key Fields
salesforce_contracts                   126      AccountId, Status, StartDate, EndDate, ContractNumber, OwnerId
salesforce_contract_line_items         74       ContractId, Product2Id, Quantity, UnitPrice, TotalPrice
salesforce_leads                       130      FirstName, LastName, Email, Phone, Company, Status, OwnerId
salesforce_events                      89       Subject, StartDateTime, EndDateTime, OwnerId, WhoId, WhatId
salesforce_tasks                       80       Subject, Status, Priority, OwnerId, WhoId, WhatId, ActivityDate
salesforce_models                      27       Name, Model__c fields, strategy, risk profile
salesforce_investor_profiles           22       AccountId, risk tolerance, objectives, suitability fields
salesforce_investor_profile_history    13       AccountId, field changed, old value, new value, date
salesforce_trade_tickets               76       AccountId, security, quantity, price, trade type, status
salesforce_record_types                18       Name, SobjectType, IsActive, Description
salesforce_account_account_relations            AccountFromId, AccountToId, Roles
salesforce_account_contact_relations            AccountId, ContactId, Roles, IsDirect
salesforce_account_contract_relations           AccountId, ContractId

V

Archiving Policy

There is no dedicated file-archival DAG for Salesforce: the data is read from a live API, so there are no source files to archive. Database-level archival of Salesforce tables can be configured through the redwood_database_archive DAG (see the database_archive_config table). That DAG runs daily at 2:00 AM UTC and dumps enabled tables to Azure Blob cold storage as compressed ZIP files.

Setting               Value
Database archive DAG  redwood_database_archive
Schedule              0 2 * * * (daily at 2:00 AM UTC)
Cold storage path     {blob_container}/{blob_path_prefix}/{YYYYMMDD}/{table}_{timestamp}.zip
Config table          database_archive_config
Log table             database_archive_log
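
The cold storage path template can be filled in as sketched below. `archive_blob_path` is a hypothetical helper, and the layout of the {timestamp} component is an assumption because this page does not specify it.

```python
from datetime import datetime, timezone


def archive_blob_path(blob_container, blob_path_prefix, table, now):
    """Fill the cold storage path template for one table dump."""
    date_part = now.strftime("%Y%m%d")
    # Assumption: the {timestamp} placeholder is rendered as YYYYMMDD_HHMMSS.
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    return f"{blob_container}/{blob_path_prefix}/{date_part}/{table}_{timestamp}.zip"
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the path reproducible when a run is retried or tested.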