Welcome to Dalgo

Leverage Dalgo to manage your data, so that you can learn from it.

Our open-source data platform enables NGOs to harness the power of data by automating data consolidation, transformation, storage and visualization through a unified interface.

This ensures that you spend no time on repetitive manual data crunching and can direct your efforts towards using data to monitor and evaluate your impact: learning, iterating, and communicating your impact internally and externally.

Visit dalgo.in to learn more about the product and pricing, or contact us at support@dalgo.in.

Our team is always available to provide you with support via Discord. Join our server and chat with us on Dalgo Support.

Platform Overview

dashboard.dalgo.in is the interface for your M&E team, data analysts/engineers, or IT team. It enables you to set up and monitor your automated data pipelines through the following sections:

  1. Ingest: Set up your data warehouse > connect to your sources of data > connect your sources to your data warehouse.
  2. Transform: Connect to your dbt repository, which contains the SQL code for your data transformation (cleaning/merging/computation).
  3. Orchestrate: Schedule your data ingestion and/or transformation.
  4. Pipeline Overview: Monitor the health of your pipeline with a view of all your past runs.
  5. Analysis: View your data on your Superset dashboards within the platform and ensure that it is being populated as per your expectations.
  6. User Management: Add relevant team members to your organisation and collaborate.

    Note: Superset will only be made available to you if you have subscribed to Dalgo with Superset.

Managing your data pipeline

As a user of Version 1 of Dalgo, your data pipelines will likely already have been set up for you by us or by one of our partners. The steps below are intended to help you make changes to your pipeline as you use it over time.

1. Logging in as a first-time user

  1. You will receive an invitation to the platform from notifications@dalgo.in.
  2. Clicking on the link will take you to the Dalgo platform.
  3. Accept the invitation and set up your password.
  4. You are now logged into the platform.

    Note: If the pipeline overview page says “No pipelines available. Please create one”, then reach out to support@dalgo.in or to the partner who is helping you with setup.

2. Ingest

Through this step Dalgo connects your different data sources to a single warehouse.

2.1. Managing your Warehouse

Your warehouse is the single location where data from various sources is stored.

  1. Click on Ingest from the left menu pane and then select the ‘Your Warehouse’ tab.
  2. Dalgo currently supports BigQuery and PostgreSQL as warehouses for the platform. You should see one of these already set up.
  3. If you wish to edit your warehouse name, click on the green edit button at the bottom of the window.
  4. If you wish to use a different warehouse from the one currently set up, select ‘delete warehouse’ and confirm, then select ‘add a new warehouse’.
  5. To set up a new warehouse, name your warehouse, select the type of warehouse, fill in the relevant credentials, and click ‘save changes and test’.

Note: Please seek advice from your internal tech team, your tech partner, or the Dalgo team (support@dalgo.in) if you need guidance on this.

2.2. Managing your Data Sources

Your data sources are the different places where your data lives. These could be Google Sheets, KoboToolbox, Avni, or CommCare, to name a few. Dalgo can connect to over 300 data sources, and we also develop connectors for new sources to meet your needs.

For example, connecting a Google Sheets source requires a Google service account that Dalgo can use to read your spreadsheet. The steps below walk you through setting one up.

Create a service account

  1. Open the Service Accounts Page in your Google Cloud console.
  2. Select an existing project, or create a new project.
  3. At the top of the page, click + Create service account.
  4. Enter a name and description for the service account, then click Create and Continue.
  5. Under Service account permissions, select the roles to grant to the service account, then click Continue. We recommend the Viewer role.

Generate a key

  1. Go to the API Console/Credentials page and click on the email address of the service account you just created.
  2. In the Keys tab, click + Add key, then click Create new key.
  3. Select JSON as the Key type. This will generate and download the JSON key file that you'll use for authentication. Click Continue.

Enable the Google Sheets API

  1. Go to the API Console/Library page.
  2. Make sure you have selected the correct project from the top.
  3. Find and select the Google Sheets API.
  4. Click ENABLE.

If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, give your service account access to your spreadsheet by sharing the sheet with the service account’s email address.

Add a source

  1. Click on the “Sources” tab in the Ingest section.
  2. To add a source, click on “+ New Source”.
  3. Give your source a unique name.
  4. Select the type of source you want to add, and the required credentials for this source will appear.
  5. Fill in the required credentials.
  6. Click ‘save changes and test’.
  7. If you have entered the correct credentials, the source will be added.

    Note: If you do not have the required credentials for your selected source, contact the relevant person on your team who would have them. You can also search online for instructions on where and how to find these source credentials.

  8. To edit a source, click the 3 dots on the right of the source bar and select edit, then click ‘save changes and test’. Note: You cannot change the source type; instead, add a new source of the new type.
  9. To delete a source, click the 3 dots on the right of the source bar and select delete, then confirm.

2.3. Managing your Connections

In this section you direct the data coming in from your sources into your warehouse, specifying which tables (streams) from each source you want to sync and how you wish to sync them.

  1. Click on the ‘Connections’ tab in the Ingest section.
  2. To add a new connection, select “+ New Connection”.
  3. Give your connection a name and select the source for which you want to build the connection. The streams (tables) available from that source will appear in the streams column.
  4. Select whether you want the data to be normalised.
  5. Select the relevant streams from your data source that you wish to connect by toggling the sync button.
  6. Select how you would like this data to be synced and click ‘Connect’.
  7. To test your configuration, select the ‘Sync’ button on the right side of the connections bar. The sync will begin to run and logs will populate in the section below (see the query sketch after this list).
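After a successful sync, you can sanity-check that rows actually landed in your warehouse. A minimal sketch, assuming a hypothetical stream named form_responses synced into a schema named staging (your schema and stream names will differ):

```sql
-- Run in your warehouse's SQL console (BigQuery or PostgreSQL).
-- `staging.form_responses` is a placeholder; use the schema and
-- stream name from your own connection.
SELECT COUNT(*) AS row_count
FROM staging.form_responses;
```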

3. Transform

Dalgo runs data transformations (data cleaning, joining, computation) using dbt (data build tool).

  1. Select Transform on the left menu panel.
  2. To set up your transformations, click “connect and set up repo”.
  3. Paste your GitHub repo URL (where the code for your data transformations lies). A sketch of what such a repo contains follows this list.
  4. Specify the target schema (generally ‘prod’ or ‘dev’; this depends on your dbt developer’s naming convention).
  5. Click save.
  6. To check your setup, select a function and click execute.
  7. The function will be executed and the logs displayed below.
  8. You can add a custom task, which will appear at the bottom of your list.
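To make the repo setup concrete, here is a minimal sketch of a model file that might live in your dbt repository. The file path, source definition, and column names are hypothetical; your repository will follow your dbt developer’s own conventions:

```sql
-- models/staging/stg_form_responses.sql (hypothetical file in your dbt repo)
-- A minimal dbt model: renames columns and drops test submissions.
-- `raw` and `form_responses` are placeholders for a source defined
-- in the repo's sources.yml.
select
    id           as response_id,
    submitted_at,
    district
from {{ source('raw', 'form_responses') }}
where coalesce(is_test, false) = false
```

When a pipeline runs with ‘Transform data?’ enabled, dbt builds each such model as a table or view in the target schema you specified above (e.g. ‘prod’ or ‘dev’).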

4. Orchestrate

Through this step Dalgo enables you to automate your data pipeline by setting up scheduled ingestion and transformation.

  1. Select Orchestrate on the left menu panel.
  2. Select “+ New Pipeline”. This will take you to the “Create a new Pipeline” screen.
  3. Give your pipeline a name.
  4. Select one or more of the connections you have set up.
  5. Toggle ‘Transform data?’ as per your needs.
  6. Set the schedule for your pipeline and click Save.
  7. You can test your pipeline by clicking ‘Run’.
  8. You can view logs of your past runs by selecting ‘last logs’. Click ‘show more’ to see the details.

Monitoring your data pipelines

Pipeline Overview

This section is intended to help you monitor the health of your data pipelines and provide you with a way to investigate further.

  1. Once you have set up at least one pipeline in the orchestration section, you will see it in the overview section. Each pipeline is represented separately.
  2. Each vertical bar represents a pipeline run. A green bar represents success. A yellow bar represents a successful run with a failure in an ancillary function, for example a dbt test (see the sketch after this list). A red bar indicates that the pipeline run has failed.
  3. To investigate further, hover over the bar, note the start time, and click on check logs.
  4. This will take you to the Orchestrate section, where you should check for the logs corresponding to the start time of the relevant run (ref. step 8 in the Orchestrate section above).
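For context on the yellow bar: a dbt test is a check defined in your dbt repository that runs alongside your transformations. A minimal sketch of a singular test, with hypothetical file and model names; the test fails when its query returns any rows:

```sql
-- tests/assert_no_future_submissions.sql (hypothetical file in your dbt repo)
-- A singular dbt test: it fails if this query returns any rows,
-- i.e. if any submission claims a timestamp in the future.
select response_id, submitted_at
from {{ ref('stg_form_responses') }}
where submitted_at > current_timestamp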

Analysis

Dalgo offers a hosted version of Superset for visualisation. Subscription to Superset is optional.

  1. If you have not subscribed to Superset, you will see a message to this effect. Contact support@dalgo.in if you wish to add Superset to your subscription.
  2. If you have subscribed to Superset, you will see a button for Google sign-in. Click on it.
  3. A pop-up window will appear. Select Sign In with Google.
  4. If your Superset admin has granted access to your email ID, you will be logged in successfully. Otherwise, contact your Superset admin.
  5. Once successfully logged in, close the pop-up window. You will now be able to access Superset via Dalgo to build your charts and monitor whether visualisations are being populated as expected.

Usage Dashboard

Dalgo offers a usage analytics dashboard for organisations that have subscribed to Superset visualisation. This dashboard gives you an overview of how well the visualisations have been adopted by the users in your organisation. Insights drawn from here can be used to optimise dashboard utilisation and decision-making throughout your organisation.

Metrics tab

  1. Active Users - This bar graph shows the total number of active and inactive users in your organisation. A user is counted as active if they have visited a dashboard at least once (see the illustrative query below).
  2. List of users - This table shows all the users in your organisation with their total visits to the dashboard(s) assigned to them. In Superset, you can map dashboards to the particular set of users who are supposed to see them.
  3. List of dashboards - This table shows all the dashboards created under your organisation, along with the total number of visits made to them.

Filters at the top allow you to view these metrics across various dimensions: time (month), role, and dashboard.
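For intuition, the active-user definition could be expressed in SQL roughly as follows. This is illustrative only; the tables are hypothetical, not Dalgo’s actual internal schema:

```sql
-- Illustrative only: hypothetical `users` and `dashboard_visits` tables.
-- A user is "active" if they have at least one dashboard visit.
SELECT
    u.email,
    COUNT(v.user_id) AS total_visits,
    COUNT(v.user_id) > 0 AS is_active
FROM users AS u
LEFT JOIN dashboard_visits AS v
       ON v.user_id = u.id
GROUP BY u.email;
```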


Trends tab

  1. No. of users accessing the dashboard - This number card with a trendline shows the monthly trend of the total number of users accessing the dashboard(s). The number itself is the count of users accessing the dashboard in the current month, and the percentage shows the change compared to the previous month. For example, if 40 users visited this month against 32 last month, the card shows +25%.
  2. No. of visits per user - This number card with a trendline depicts the trend of the average number of visits per user for the dashboard(s). The number is the average visits per user for the current month, while the percentage shows the change from the previous month.

Top-level filters allow you to look at the trends across various slices of role and dashboard in your organisation.


User Management

User management enables you to collaborate with relevant team members while using Dalgo. It allows you to add different users to the platform and assign each a role that grants them the relevant ‘view’ or ‘update’ permissions.

  1. Click on User Management in the left menu pane.
  2. In the ‘Users’ tab you will be able to see all your current users of Dalgo and their roles.
  3. To invite a user, select invite user, enter their email and the role you want to give them, and select send invitation.
  4. Once a user is invited, you will see their name in the Pending Invitations tab. You can delete the invite or resend it using the three dots to the right.
  5. To delete a user or transfer role ownership, select the 3 dots to the right of their name, pick the relevant option, and confirm.
  6. These are the available roles in Dalgo with their associated permissions:
  Role             | User management | Warehouse | Sources | Connections | Transform | Orchestrate | Superset Usage dashboard
  Account Manager  | Update          | Update    | Update  | Update      | Update    | Update      | View
  Pipeline Manager | View            | View      | Update  | Update      | Update    | Update      | View
  Analyst          | View            | View      | View    | View        | Update    | View        | View
  Guest            | View            | View      | View    | View        | View      | View        | View

Managing Schema Changes

When you need to make changes to your source data schema, Dalgo provides a streamlined process to ensure your sync pipelines are updated accordingly.

  1. Detecting Changes: Dalgo automatically detects any changes you make to your source data schema. This includes additions, deletions, or modifications of columns in your tables (illustrated in the sketch after this list).

  2. Pending Changes Section: On the Ingest page, you’ll find a “Pending Changes” section whenever a schema change is detected in any of your connections. This section lists all connections with detected schema changes, making them easy to manage.

  3. Viewing Details: By clicking the “View” button next to a connection in the “Pending Changes” section, you can see detailed information about the changes, such as which columns have been added or removed.

  4. Approving or Ignoring Changes: You have the flexibility to either approve or ignore these changes based on your needs. Approving the changes will automatically sync your data with the updated schema. Once approved, the pending actions tab will disappear.

  5. Handling Breaking Changes: If a change involves the removal of a critical field, such as a cursor field, it is identified as a breaking change. Breaking changes cannot be approved through Dalgo; you will need to resolve the issue at the source to keep your connections operational.

  6. Seamless Syncing: Once the schema changes are approved, Dalgo will seamlessly sync the data with the updated schema, ensuring consistency and accuracy across your data transformations.
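For illustration, suppose your source is itself a PostgreSQL database with a hypothetical table named form_responses. These are the kinds of source-side changes Dalgo would detect:

```sql
-- Hypothetical source-side schema changes on a PostgreSQL source.
-- Table and column names are placeholders.

-- Non-breaking: an added column appears under "Pending Changes"
-- and can be approved from the Ingest page.
ALTER TABLE form_responses ADD COLUMN follow_up_date date;

-- Breaking: dropping the column used as the sync cursor
-- (e.g. an updated_at timestamp) cannot be approved in Dalgo
-- and must be resolved at the source.
ALTER TABLE form_responses DROP COLUMN updated_at;
```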

Note: It's important to ensure that your schema changes are compatible with your data sources and downstream applications. Consult with your data team or reach out to Dalgo support (support@dalgo.in) for assistance.

By following these steps, you can effectively manage schema changes within your Dalgo data pipeline, ensuring that your data transformations remain accurate and up-to-date.

AI Data Analysis v0.1

Dalgo's AI Data Analysis allows you to leverage AI and ask Dalgo questions about the data stored in your warehouse. Explore a range of possibilities: generate quick insights on your data, learn more about the data quality of your dataset, or even summarise qualitative data!

Prerequisite: You must have your warehouse set up on Dalgo, and the warehouse must contain some data.


Steps to use the feature

  1. Navigate to AI Data Analysis in the left pane.
  2. Enter an SQL query to select the table that you want to analyse (see the example after this list).
  3. Select the ‘Summarise’ prompt and build on it, or enter a custom prompt.
  4. Press submit and wait for the response to be generated.
  5. Iterate on your prompt or query to improve the output.
  6. When you are satisfied with the output, click ‘save as’ to save the session so that you can access it again for future analyses.
  7. Click on saved sessions to access previously saved sessions.
  8. You can also press the copy symbol to copy the response and paste it into a deck.
  9. You may choose to download your SQL query, prompt, and response as a CSV by clicking on the download button.
  10. If the response is unsatisfactory, press the ‘thumbs down’ icon at the bottom right of the window and share your feedback with us.
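An example for step 2, assuming a hypothetical table prod.covid_deaths in your warehouse (substitute your own schema and table names):

```sql
-- Hypothetical query for step 2: select the data you want analysed.
-- `prod.covid_deaths` is a placeholder; point this at your own table.
SELECT country, deaths
FROM prod.covid_deaths
LIMIT 500;  -- the feature currently analyses at most 500 rows
```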

Troubleshooting and FAQs

  1. If the response is not to your satisfaction then iterate on your prompt.
  2. If the page is taking too long to generate a response then reload it and try a different prompt.
  3. This feature currently analyses at most 500 rows, so the output will not be representative of your entire dataset if it contains more than 500 rows.

Tips and Best Practices

  1. Give context on the data to the extent that you can.
  2. If your column names are not representative of the data they contain, mention the column name in your prompt. For example, if a column named ‘xyz’ holds the number of COVID deaths per country, the prompt should ask to calculate total COVID deaths from column ‘xyz’.
  3. Mention whether you want the output numbered, in bullets, or in a table, and how many points or words you want. This will help format and limit the response.