Introduction
As data grows exponentially, organizations are continuously seeking more efficient and powerful tools to manage and analyze their data. The Explore tab, which utilizes the Axiom Processing Language (APL), is one such service that offers fast, scalable, and interactive data exploration capabilities. If you are an SQL user looking to migrate to APL, this guide will provide a gentle introduction to help you make the transition smoothly.
This tutorial will guide you through migrating SQL to APL, helping you understand key differences and providing you with query examples.
Introduction to Axiom Processing Language (APL)
Axiom Processing Language (APL) is the language used by the Explore tab, a fast and highly scalable data exploration service. APL is optimized for real-time and historical data analytics, making it a suitable choice for various data analysis tasks.
Tabular operators: In APL, there are several tabular operators that help you manipulate and filter data, similar to SQL’s SELECT, FROM, WHERE, GROUP BY, and ORDER BY clauses. Some of the commonly used tabular operators are:
extend
: Adds new columns to the result set.
project
: Selects specific columns from the result set.
where
: Filters rows based on a condition.
summarize
: Groups and aggregates data similar to the GROUP BY clause in SQL.
sort
: Sorts the result set based on one or more columns, similar to ORDER BY in SQL.
Key differences between SQL and APL
While SQL and APL are query languages, there are some key differences to consider:
- APL is designed for querying large volumes of structured, semi-structured, and unstructured data.
- APL is a pipe-based language, meaning you can chain multiple operations using the pipe operator (
|
) to create a data transformation flow.
- APL doesn’t use SELECT, and FROM clauses like SQL. Instead, it uses keywords such as summarize, extend, where, and project.
- APL is case-sensitive, whereas SQL isn’t.
Benefits of migrating from SQL to APL:
-
Time Series Analysis: APL is particularly strong when it comes to analyzing time-series data (logs, telemetry data, etc.). It has a rich set of operators designed specifically for such scenarios, making it much easier to handle time-based analysis.
-
Pipelining: APL uses a pipelining model, much like the UNIX command line. You can chain commands together using the pipe (|
) symbol, with each command operating on the results of the previous command. This makes it very easy to write complex queries.
-
Easy to Learn: APL is designed to be simple and easy to learn, especially for those already familiar with SQL. It does not require any knowledge of database schemas or the need to specify joins.
-
Scalability: APL is a more scalable platform than SQL. This means that it can handle larger amounts of data.
-
Flexibility: APL is a more flexible platform than SQL. This means that it can be used to analyze different types of data.
-
Features: APL offers more features and capabilities than SQL. This includes features such as real-time analytics, and time-based analysis.
Basic APL Syntax
A basic APL query follows this structure:
| <DatasetName>
| <FilteringOperation>
| <ProjectionOperation>
| <AggregationOperation>
Query Examples
Let’s see some examples of how to convert SQL queries to APL.
SELECT with a simple filter
SQL:
SELECT *
FROM [Sample-http-logs]
WHERE method = 'GET';
APL:
['sample-http-logs']
| where method == 'GET'
Run in Playground
COUNT with GROUP BY
SQL:
SELECT Country, COUNT(*)
FROM [Sample-http-logs]
GROUP BY method;
APL:
['sample-http-logs']
| summarize count() by method
Run in Playground
Top N results
SQL:
SELECT TOP 10 Status, Method
FROM [Sample-http-logs]
ORDER BY Method DESC;
APL:
['sample-http-logs']
| top 10 by method desc
| project status, method
Run in Playground
Simple filtering and projection
SQL:
SELECT method, status, geo.country
FROM [Sample-http-logs]
WHERE resp_header_size_bytes >= 18;
APL:
['sample-http-logs']
| where resp_header_size_bytes >= 18
| project method, status, ['geo.country']
Run in Playground
COUNT with a HAVING clause
SQL:
SELECT geo.country
FROM [Sample-http-logs]
GROUP BY geo.country
HAVING COUNT(*) > 100;
APL:
['sample-http-logs']
| summarize count() by ['geo.country']
| where count_ > 100
Run in Playground
Multiple Aggregations
SQL:
SELECT geo.country,
COUNT(*) AS TotalRequests,
AVG(req_duration_ms) AS AverageRequest,
MIN(req_duration_ms) AS MinRequest,
MAX(req_duration_ms) AS MaxRequest
FROM [Sample-http-logs]
GROUP BY geo.country;
APL:
Users
| summarize TotalRequests = count(),
AverageRequest = avg(req_duration_ms),
MinRequest = min(req_duration_ms),
MaxRequest = max(req_duration_ms) by ['geo.country']
Run in Playground
Sum of a column
SQL:
SELECT SUM(resp_body_size_bytes) AS TotalBytes
FROM [Sample-http-logs];
APL:
[‘sample-http-logs’]
| summarize TotalBytes = sum(resp_body_size_bytes)
Run in Playground
Average of a column
SQL:
SELECT AVG(req_duration_ms) AS AverageRequest
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| summarize AverageRequest = avg(req_duration_ms)
Run in Playground
Minimum and Maximum Values of a column
SQL:
SELECT MIN(req_duration_ms) AS MinRequest, MAX(req_duration_ms) AS MaxRequest
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| summarize MinRequest = min(req_duration_ms), MaxRequest = max(req_duration_ms)
Run in Playground
Count distinct values
SQL:
SELECT COUNT(DISTINCT method) AS UniqueMethods
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| summarize UniqueMethods = dcount(method)
Run in Playground
Standard deviation of a data
SQL:
SELECT STDDEV(req_duration_ms) AS StdDevRequest
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| summarize StdDevRequest = stdev(req_duration_ms)
Run in Playground
Variance of a data
SQL:
SELECT VAR(req_duration_ms) AS VarRequest
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| summarize VarRequest = variance(req_duration_ms)
Run in Playground
Multiple aggregation functions
SQL:
SELECT COUNT(*) AS TotalDuration, SUM(req_duration_ms) AS TotalDuration, AVG(Price) AS AverageDuration
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| summarize TotalOrders = count(), TotalDuration = sum( req_duration_ms), AverageDuration = avg(req_duration_ms)
Run in Playground
Aggregation with GROUP BY and ORDER BY
SQL:
SELECT status, COUNT(*) AS TotalStatus, SUM(resp_header_size_bytes) AS TotalRequest
FROM [Sample-http-logs];
GROUP BY status
ORDER BY TotalSpent DESC;
APL:
['sample-http-logs']
| summarize TotalStatus = count(), TotalRequest = sum(resp_header_size_bytes) by status
| order by TotalRequest desc
Run in Playground
Count with a condition
SQL:
SELECT COUNT(*) AS HighContentStatus
FROM [Sample-http-logs];
WHERE resp_header_size_bytes > 1;
APL:
['sample-http-logs']
| where resp_header_size_bytes > 1
| summarize HighContentStatus = count()
Run in Playground
Aggregation with HAVING
SQL:
SELECT Status
FROM [Sample-http-logs];
GROUP BY Status
HAVING COUNT(*) > 10;
APL:
['sample-http-logs']
| summarize OrderCount = count() by status
| where OrderCount > 10
Run in Playground
Count occurrences of a value in a field
SQL:
SELECT content_type, COUNT(*) AS RequestCount
FROM [Sample-http-logs];
WHERE content_type = ‘text/csv’;
APL:
['sample-http-logs'];
| where content_type == 'text/csv'
| summarize RequestCount = count()
Run in Playground
String Functions:
Length of a string
SQL:
SELECT LEN(Status) AS NameLength
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend NameLength = strlen(status)
Run in Playground
Concatentation
SQL:
SELECT CONCAT(content_type, ' ', method) AS FullLength
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend FullLength = strcat(content_type, ' ', method)
Run in Playground
Substring
SQL:
SELECT SUBSTRING(content_type, 1, 10) AS ShortDescription
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend ShortDescription = substring(content_type, 0, 10)
Run in Playground
Left and Right
SQL:
SELECT LEFT(content_type, 3) AS LeftTitle, RIGHT(content_type, 3) AS RightTitle
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend LeftTitle = substring(content_type, 0, 3), RightTitle = substring(content_type, strlen(content_type) - 3, 3)
Run in Playground
Replace
SQL:
SELECT REPLACE(StaTUS, 'old', 'new') AS UpdatedStatus
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend UpdatedStatus = replace('old', 'new', status)
Run in Playground
Upper and Lower
SQL:
SELECT UPPER(FirstName) AS UpperFirstName, LOWER(LastName) AS LowerLastName
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| project upperFirstName = toupper(content_type), LowerLastNmae = tolower(status)
Run in Playground
LTrim and RTrim
SQL:
SELECT LTRIM(content_type) AS LeftTrimmedFirstName, RTRIM(content_type) AS RightTrimmedLastName
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend LeftTrimmedFirstName = trim_start(' ', content_type), RightTrimmedLastName = trim_end(' ', content_type)
Run in Playground
Trim
SQL:
SELECT TRIM(content_type) AS TrimmedFirstName
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend TrimmedFirstName = trim(' ', content_type)
Run in Playground
Reverse
SQL:
SELECT REVERSE(Method) AS ReversedFirstName
FROM [Sample-http-logs];
APL:
['sample-http-logs']
| extend ReversedFirstName = reverse(method)
Run in Playground
Case-insensitive search
SQL:
SELECT Status, Method
FROM “Sample-http-logs”
WHERE LOWER(Method) LIKE 'get’';
APL:
['sample-http-logs']
| where tolower(method) contains 'GET'
| project status, method
Run in Playground
Take the First Step Today: Dive into APL
The journey from SQL to APL might seem daunting at first, but with the right approach, it can become an empowering transition. It is about expanding your data query capabilities to leverage the advanced, versatile, and fast querying infrastructure that APL provides. In the end, the goal is to enable you to draw more value from your data, make faster decisions, and ultimately propel your business forward.
Try converting some of your existing SQL queries to APL and observe the performance difference. Explore the Axiom Processing Language and start experimenting with its unique features.
Happy querying!