AI Generator


The CData Connect AI Generator creates executable SQL queries from plain text prompts. Ask a question about the data in a CData Connect connection and receive a ready-to-use SQL query generated by artificial intelligence (AI). AI Generator bases the query on the table metadata in your data source, so the result reflects your data model.

This speeds up query creation by reducing the time that you would otherwise spend building a query manually. AI Generator can handle complex queries on any data source that is connected in CData Connect.

Prerequisites

Before you can use AI Generator, you must configure a connection in CData Connect. For more information about configuring a connection, see Connections. Inside AI Generator, you can select any of your existing connections to query.

AI Generator

To access the generator, click AI Generator on the top right of the Data Explorer page.

After you click AI Generator, an options pane opens on the right.

Configure AI Generator by following the steps below:

  1. Select a CData Connect connection from the Connection list.
  2. Select the tables that you are interested in querying from the Tables list.
  3. Enter a plain text description in the SQL Prompt text box to describe the query that you want to execute on the data.
  4. Click Generate SQL. The SQL statement that is generated by AI appears in the code box.
  5. Click Execute. The query is copied into the Data Explorer SQL box. You can also click Copy to copy the SQL query and paste it manually.
  6. Click Execute again to run the query on the data. The result appears underneath the Data Explorer SQL box.

Secure Query Generation

After you select your connection, the tables available in that connection populate the Tables list. The plain text prompt that you enter is bundled with the metadata (column names only) of the tables that you selected and is sent as a prompt to a Microsoft Azure OpenAI endpoint. For security reasons, no data beyond this metadata is transmitted to Azure. The service generates an SQL query and returns it to CData Connect. After the SQL is generated, the metadata is discarded.
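To make concrete what leaves CData Connect, the following Python sketch bundles a prompt with column names only. The function name build_prompt_payload, the dictionary layout, and the sample Orders table are all hypothetical; the actual request format that CData Connect sends to Azure is not described in this document.

```python
import json


def build_prompt_payload(prompt, selected_tables):
    """Bundle a plain text prompt with column names only (hypothetical layout).

    selected_tables maps a table name to a list of rows, where each row is a
    dict of column name -> value. Only the column names are copied out; the
    row values never enter the payload.
    """
    metadata = {}
    for table_name, rows in selected_tables.items():
        columns = []
        for row in rows:
            for column in row:
                if column not in columns:
                    columns.append(column)
        metadata[table_name] = columns
    return {"prompt": prompt, "tables": metadata}


# Hypothetical sample data, used only to show that values are left out.
sample_tables = {
    "Orders": [
        {"OrderId": 1, "Customer": "Acme", "Total": 250.0},
        {"OrderId": 2, "Customer": "Globex", "Total": 99.5},
    ]
}

payload = build_prompt_payload("Total sales per customer", sample_tables)
print(json.dumps(payload, indent=2))
```

The printed payload contains the prompt and the three column names, but none of the row values such as customer names or totals.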

Tokens

Large language models (LLMs) process a request by first translating its characters into tokens. By calculating the number of tokens that a request requires, CData Connect judges the complexity of the request and the processing power needed to complete it. As a rule of thumb, a token corresponds to about four characters.
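The four-characters-per-token guideline can be turned into a quick estimate. The sketch below is only an approximation under that assumption; the function name estimate_tokens is hypothetical, and the count that AI Generator displays comes from the model's real tokenizer and can differ.

```python
import math

CHARS_PER_TOKEN = 4  # guideline from the text; real tokenizers vary


def estimate_tokens(text):
    """Return a rough token estimate: character count divided by four, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Show the ten customers with the highest total order value"
print(estimate_tokens(prompt))
```

Keep in mind that the column names of the selected tables are sent along with the prompt, so a complete estimate would include their length as well.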

Below the SQL Prompt text box (bottom right), you can see the total number of tokens used by each query. The model can process 4000 tokens in a single request. Because this total includes both the input and the result, we enforce a 2000-token limit on the input. Each user can submit thirty requests per month; that is, each user can click Generate SQL thirty times each month. Requests that fail because of networking errors, such as timeouts, are not counted. Requests that fail because of invalid prompts still count toward the monthly limit. The limit resets each month, and unused requests do not roll over.
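The limits above can be summarized in a small Python sketch. The RequestBudget class and the outcome labels are hypothetical and only model the rules as stated in this section; they are not part of CData Connect.

```python
MODEL_TOKEN_LIMIT = 4000   # input plus result, per request
INPUT_TOKEN_LIMIT = 2000   # limit enforced on the input
MONTHLY_REQUESTS = 30      # Generate SQL clicks per user per month


class RequestBudget:
    """Track one user's monthly Generate SQL requests (hypothetical helper)."""

    def __init__(self):
        self.used = 0

    def remaining(self):
        return MONTHLY_REQUESTS - self.used

    def can_submit(self, input_tokens):
        # A request needs an input within the limit and at least one request left.
        return input_tokens <= INPUT_TOKEN_LIMIT and self.remaining() > 0

    def record(self, outcome):
        """outcome is 'ok', 'invalid_prompt', or 'network_error'."""
        # Networking errors such as timeouts are not counted; invalid prompts are.
        if outcome != "network_error":
            self.used += 1

    def reset_month(self):
        # Unused requests do not roll over, so the counter simply restarts.
        self.used = 0


budget = RequestBudget()
budget.record("ok")
budget.record("network_error")
budget.record("invalid_prompt")
print(budget.remaining())  # 28: the timeout was not counted
```

With a 2000-token input, at least MODEL_TOKEN_LIMIT - INPUT_TOKEN_LIMIT = 2000 tokens remain for the generated result.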

Note: The model has a per-minute processing capacity. If multiple users send requests simultaneously, the service might stall or need extra time to transmit a response.

Data Privacy

CData considers customer privacy to be extremely important. The data shared between CData Connect and Microsoft Azure OpenAI for purposes of the AI Generator is kept confidential and discarded properly. For more details, see the Terms of Service.