Run SQL Directly Against Files in Cloud Object Store with APEX
4 min read
Did you know that using APEX PL/SQL APIs, you can run SQL queries against flat files stored in Cloud Object Storage? In this short, I will take you through an example of how you can do this with Oracle Cloud Infrastructure (OCI) Object Store. The principles are the same for other object stores, e.g., AWS S3, Microsoft SharePoint, etc.
- You must have the root SSL certificate related to the Cloud Provider Object Store URL installed in your database wallet. This is taken care of for you if you use apex.oracle.com, Oracle Autonomous Database, or APEX Service.
- You must have an APEX Web Credential that has read access to a bucket in OCI Object Store. See my post Secure APEX Access to OCI Object Storage for details on how to do this.
- I downloaded a CSV file from Kaggle with International football ⚽ results from 1872 to 2022.
- The file has 43,752 records and is 3.12 MB in size
- I uploaded the CSV to an Oracle Cloud Infrastructure (OCI) Object Store Bucket called
- The URL for the file in OCI is
- I have an APEX web credential called
APEX_OCI_BLOG_CREDENTIAL, which has read access to the OCI bucket
Oracle APEX PL/SQL APIs
We will work with two APEX PL/SQL APIs, which make querying files in cloud object stores possible. These are apex_data_parser and apex_web_service.
Determine the File Structure
The first thing we need to do is understand the file's structure.
SELECT * FROM apex_data_parser.get_columns (apex_data_parser.discover (p_content => apex_web_service.make_rest_request_b (p_url => 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/XXXX/b/APEX_OCI_BLOG_FILES/o/results.csv', p_http_method => 'GET', p_credential_static_id => 'APEX_OCI_BLOG_CREDENTIAL'), p_max_rows => 1000, p_file_name =>'test.csv' ))
- The API apex_web_service.make_rest_request_b fetches the CSV file from OCI Object Storage and returns a BLOB. The Web Credential
APEX_OCI_BLOG_CREDENTIALis used to authenticate into OCI Object Storage
- The result from apex_web_service.make_rest_request_b is passed into apex_data_parser.discover, which inspects the file and creates a JSON document describing the files structure
- The name for p_file_name is not important, but the extension is. The extension is used to determine the file type that is being parsed
- p_max_rows tells the API to only look at the first 1,000 records when inspecting the structure
- The JSON from apex_data_parser.discover is passed into apex_data_parser.get_columns which generates a columnar representation of the JSON like this:
Web Credentials can be used with
apex_web_serviceoutside of the context of an APEX Session, as long as the schema is mapped to an APEX Workspace. If the schema is mapped to multiple workspaces, you must first call
APEX_UTIL.SET_SECURITY_GROUP_ID. See documentation for details.
Querying the File
Now that we know the file's structure, we can run queries against it. In the SQL below, I am fetching the win, loss, and draw record for Wales 🏴. BTW, at the time of writing this post, it is 221 wins, 315 Losses, and 151 draws 😔
WITH scores AS (SELECT line_number , TO_DATE(col001,'YYYY"-"MM"-"DD') date_played , col002 home_team , col003 away_team , TO_NUMBER(col004) home_score , TO_NUMBER(col005) away_score , col006 tournament , col007 city , col008 country FROM apex_data_parser.parse (p_content => apex_web_service.make_rest_request_b (p_url => 'https://objectstorage.us-ashburn-1.oraclecloud.com/n/XXXX/b/APEX_OCI_BLOG_FILES/o/results.csv', p_http_method => 'GET', p_credential_static_id => 'APEX_OCI_BLOG_CREDENTIAL'), p_skip_rows => 1, p_detect_data_types => 'N', p_file_name => 'test.csv')) SELECT SUM(CASE WHEN home_team = 'Wales' AND home_score > away_score THEN 1 WHEN away_team = 'Wales' AND away_score > home_score THEN 1 ELSE 0 END) number_of_wins , SUM(CASE WHEN home_team = 'Wales' AND home_score < away_score THEN 1 WHEN away_team = 'Wales' AND away_score < home_score THEN 1 ELSE 0 END) number_of_losses , SUM(CASE WHEN home_team = 'Wales' AND home_score = away_score THEN 1 WHEN away_team = 'Wales' AND away_score = home_score THEN 1 ELSE 0 END) number_of_draws FROM scores WHERE (home_team = 'Wales' OR away_team = 'Wales')
- The BLOB returned from apex_web_service.make_rest_request_b is passed into apex_data_parser.parse, which parses the file and returns rows
- apex_data_parser.parse returns a line number and generic columns col001 thru col300.
- I have aliased the generic column names with the names found during discovery
- p_skip_rows tells the API to skip the first row
- p_detect_data_types tells the API not to inspect the data types in the file (we already know these from the previous step). This helps with performance
- Read my previous post Easy file Parsing with the APEX_DATA_PARSER API for more details on the APEX_DATA_PARSER API
I showed you how to query a CSV file stored in Oracle Object Storage using APEX PL/SQL APIs. You can use the same APIs to query Excel, JSON, and XML files.
You can run these same SQL statements from APEX Components such as Interactive Reports, Classic Reports, etc. You can incorporate this technique into PL/SQL packages for more complex logic.