Azure Cosmos DB Free Trial – walk through of Gremlin API for Graph

After the 2018 Microsoft Ignite event, Microsoft announced a free trial of Azure Cosmos DB. For those who are eager to check out Cosmos DB, this is a great opportunity to familiarize yourself with the hottest No SQL database in the market. Cosmos DB is currently sitting pretty at Rank 29 in the DB Engines Ranking page. Quite an achievement considering that the product is “only 1 year old”.

MS recently went public with the support for Cassandra API. A few years from now, Cosmos DB will be the most popular database offering from Microsoft. If you want to get a gentle intro to Cosmos DB, check out my previous posts – Introduction to Azure Cosmos DB and Azure Cosmos DB – Partition and Throughput.

This article is a walk through of using the Free Trial to get started with Cosmos DB.

Limited Time Free Trialhttps://azure.microsoft.com/en-au/try/cosmosdb/

There is no need for a credit card or subscription to avail this free trial. Please do note though that MS is likely to withdraw trial in a few months.

Step 1: Choosing an API/data model
Once you click the above URL, you will be asked to pick an API and data model. Go ahead and pick SQL to check out document (JSON) data model if you are unsure. I want to check out Gremlin API for Graph, so I am choosing Graph in this example.
You will be asked to login with your Microsoft account. If you don’t have one, create a new account.

CosmosDB Trial

Step 2: Choose default options and click on Create container.
Note that container will be created free of cost at a low throughput of 400 RU/s. If you are not familiar with throughput, read my article about Azure Cosmos DB – Partition and Throughput.
You now have a brand new container and the familiar Azure portal access. My trial container has read/write location of Central US.

Azure Portal Database

Closer look: There is a lot going on here, so let us take a closer look.
My container is called Persons and it is in graphdb database. Cosmos DB uses Apache Tinkerpop Gremlin API for graph traversal. Currently, my container Persons is empty, so let us connect a sample application. If you are new to Graph DB, I recommend reading this free ebook from another popular Graph DB product – Neo4j.
https://neo4j.com/graph-databases-book/

Graph DB - Persons

Step 3: Sample project
Click on the Quick start blade to download a sample project and explore graph data.
The sample project already has the connection string set to the trial database, so we can execute the project right away. How awesome is that!

Quickstart

In Step 1, I chose Graph API and model, so my sample project has Gremlin API queries. Read more about Gremlin API here – https://aka.ms/gremlin.

Gremlin API Queries

Step 4: Execute sample project
Now it’s time to execute the project!

Sample Program Execution

Step 5: Data Explorer
Looks like the program has added a number of graph documents to the container. Let us head over to Azure Portal and explore using Data Explorer blade. Click on the Execute Gremlin Query button to take a look at the data added by sample program. Now let us take a closer look at the output.

Graph DB Data Explorer.jpg

Closer look: A quick look shows that the query g.V() returned 4 nodes. Result can be viewed in either JSON or Graph mode.

Output - closer look 1

If we click on the second node – Ben – and zoom in, we get a nice graphical view of ben’s relationships. Click and explore to view related nodes.

Output - closer look 3

Switch to JSON mode to view result in JSON format. Pretty self-explanatory there 🙂

Output - closer look 2

Step 6: Monitor activity
Head over to Activity Log blade to take a look at activity in the Cosmos DB account. This could be downloaded as a .csv or exported to Event Hub for further analysis.

Activity Log

Summary

We took a look at the Cosmos DB Trial offered by Microsoft and got started with Graph API. Graph DB has a number of attractive use cases such as Fraud Detection and Recommendation Engine. Another popular Graph DB is Neo4j. I hope this article helped you understand how powerful Cosmos DB’s multi model support is. If you have any questions, drop me an email or add a comment. I will be happy to help. Add me on LinkedIn to stay connected – https://www.linkedin.com/in/arjunsivadasan/

Further Reading

  1. Introduction to Azure Cosmos DB and Azure Cosmos DB – Partition and Throughput
  2. Intro to Cassandra API in Cosmos DB – https://docs.microsoft.com/en-us/azure/cosmos-db/cassandra-introduction
  3. Intro to Graph DB – https://neo4j.com/graph-databases-book/
  4. Explore Gremlin API – https://aka.ms/gremlin
  5. Cosmos DB playground – https://www.documentdb.com/sql/demo
  6. Microsoft Ignite 2018 Updates – https://news.microsoft.com/uploads/prod/sites/507/2018/09/IGNITEBOOKOFNEWS-5ba95469d658b.pdf
Advertisements

Export data from Cosmos DB to SQL Server

Cosmos DB is Microsoft’s latest NoSQL database offering low latency, high scalability and geo-distribution to any Azure region. Read Microsoft documentation about Cosmos DB here or check out my blog about Introduction to Azure Cosmos DB. In its infancy, Cosmos DB was known as Document DB. Renaming was inevitable as Document DB evolved beyond just a document store.

Following is a simple note on how to export data from Cosmos DB to SQL Server. You may want a dump of data from a Cosmos DB collection to SQL Server for analysis, data integrity checks, troubleshooting a production issue or to derive insights.

There are a few methods to export data from Cosmos DB. The quickest one is to use Document DB / Cosmos DB Migration Tool. This is a tool provided by Microsoft  to migrate data TO/FROM various sources such as MongoDB, JSON, csv and SQL Server to Cosmos DB.

1. Use Azure Cosmos DB Migration tool to export data to json files:

  • Install the tool and launch as Administrator (use an Azure VM for best performance). Please be mindful of spike in RU costs when exporting data from your collection. To avoid throttling, scale your collection up as required just before you do the export or export in off-peak hours, if any.  Leave a comment if you need any pointer on this.LaunchCosmosDBMigrationTool
  • Choose source – DocumentDB (CosmosDB aka DocumentDB)

CosmosDBMigrationTool-Source

  • Specify connection string. You can find endpoint and key from Keys section of your CosmosDB account in Azure portal. Please note that CosmosDB is case-sensitive.

CosmosDBMigrationTool-ConnectionString

  • A peek into Advanced options
    • Include internal fields – for each document, Cosmos DB maintains a set of auto-generated internal/system fields such as _ts and _self. Their names start with an underscore, making it easy to differentiate them from user fields. This option lets you include internal fields in the output. This is handy, especially the _ts field, which indicates when the document was last updated.
    • Number of retries on failure & Retry interval – Set a reasonable number of retries. In this case, I have used the value 1.
    • Connection Mode you want to use Gateway to get best performance and to bypass firewall rules.

CosmosDBMigrationTool-AdvancedOptions

  • Specify target information. In this case, we want to export to a json document. We could either output to a local file or a blob.

CosmosDBMigrationTool-TargetInformation

  • Error Logging. Set these options to enable error logging.

CosmosDBMigrationTool-ErrorLogging

Hit Import and if there are no errors, you will soon have a new json file with all the data from your collection. If you want only a subset of data, all you need to do is modify your source query.

2. Import json files to SQL Server:

SQL Server 2016 introduced a JSON parse function called OPENJSON. If none of your user databases are upgraded to 2016 yet, but you have a 2016 engine, context-switch to a system database to use OPENJSON.

DECLARE @productNutrition varchar(max);

--Read from the json file using openrowset
SELECT @productNutrition = BulkColumn
FROM OPENROWSET(BULK'C:\Users\Smruthi\Downloads\Arjun\productnutrition_20180427.json', SINGLE_BLOB) JSON;

--Pass the variable containing json as parameter to OPENJSON function
SELECT *
FROM OPENJSON (@productNutrition)
WITH
(
ProductID varchar(20) '$.id',
ProductDescription varchar(100) '$.description',
ProductGroup varchar(200) '$.foodGroup',
ServingAmount float '$.servings[1].amount',
ServingUnit varchar(10) '$.servings[1].description',
nutrients nvarchar(max) as json --note that json is case sensitive
)

Give it a go yourself. If you have any question, leave a comment and I will be happy to assist.

If you are exploring Cosmos DB, consider reading my blog about Azure Cosmos DB – Partition and Throughput to get an overview of partitioning and scaling concepts.

jsonfileSample

openjson Example