
Wednesday, November 8, 2023

Behind the scenes: Terraform's Deletion and the Mysterious Auto-Restoration of Azure AD Enterprise Apps

Context:

A few weeks ago, an unexpected situation unfolded in one of our customers' production environments. It all started when a member of their team decided to pull the trigger on the "terraform destroy" command.

Their intention was to remove a specific app registration from Azure AD that they had deployed with a Terraform package. Little did they know that this (untested) action would set off a chain of events that no one could have predicted, as the command ended up deleting several Microsoft first-party enterprise applications from the Azure tenancy.

Issue:

This activity left the production environment in chaos. You might be wondering why this matters. Because those first-party Microsoft enterprise applications are the backbone of many services within the Azure AD tenant, their sudden deletion created disruptions throughout the environment, affecting numerous other apps and programs that rely on them.

Troubleshooting & terraform quirk:

After things calmed down and everyone understood the consequences, the team started looking into what had happened. Our mission was clear: solve the mystery of why the "destroy" command had such far-reaching consequences, affecting not only the intended target but also other critical apps in the Azure AD environment.

While the production environment was being restored to its normal state by manually restoring the deleted enterprise apps, we decided to re-create the scenario in a safe Microsoft demo tenant. Our experiment worked flawlessly, confirming the destructive behavior; however, it didn't resolve the current issue, nor did it answer the "why".

Later, we stumbled upon an issue reported against Terraform's Azure AD provider that highlights this behavior of the destroy command and appears to be a bug. If you have used the setting "use_existing = true" in your Terraform code, as shown in the example below, to set up the linkage between your Azure AD app and other SPNs, the destroy command goes on a rampage, deleting not only your app but also every linked SPN it can find, regardless of origin, e.g. even Microsoft's first-party enterprise applications such as SharePoint Online, Exchange Online, Intune, and MS Teams.

resource "azuread_service_principal" "sharepoint" {
  application_id = data.azuread_application_published_app_ids.well_known.result.Office365SharePointOnline
  use_existing   = true
}

With that, the "why" was explained.

Auto-restore Puzzle:

While we managed to re-create the scenario in one of the internal demo tenancies, we stumbled upon a surprising observation: some of the enterprise apps we believed were gone for good started reappearing (without any manual restore operations), and it left us intrigued.

After running out of guesses, we ended up reaching out to Azure support for answers; they confirmed the existence of automatic restoration and explained its unique behavior.

When a user accesses services like MS Teams, Exchange, SharePoint, or OneDrive in the tenancy, Microsoft's underlying services use the corresponding first-party enterprise apps. If any of these apps are found to be missing or deleted, they are automatically restored.

While this is generally the case for most apps, there are a few that don't follow the rule. The Microsoft Intune API, for example, didn't want to join in this magical recovery process, and who knows, there might be more apps with similar behavior hiding within Azure AD's depths.

Learnings:

So, what's the moral of this story? Well, the lesson is simple: before running any command in your production environment, especially when the command's name sounds a little scary, think twice or thrice.

This post also attempts to demystify the undocumented automatic restoration behavior of Azure AD enterprise apps. Hope it helps someone who is equally surprised to see their enterprise apps being restored automatically without any manual intervention.


Friday, September 29, 2023

Using Azure Function's Managed Identity for Service Bus Output Bindings

This short post is intended to share experiences while working on the following scenario:

  • Azure function with managed identity
  • Output bindings configured for service bus queue

Although this may seem straightforward, we encountered some issues in making it work. The difficulties stemmed from the lack of clear documentation on this topic and from the dependence on the version of the Service Bus extension package used in the solution.

It's commonly understood that a function app needs to define the connection string for the service bus, and this works well when the connection string contains the service bus's secret keys. Questions arise, however, when the function app needs to use its managed identity for communication with the service bus.

What's the solution?
In summary, the required connection setting must be present either in your local.settings.json file or in the application settings of the function app in the Azure portal, and it must follow a specific format: a setting named "AzureWebJobsServiceBus__fullyQualifiedNamespace" whose value is the fully qualified namespace of your service bus, e.g. "<your-namespace>.servicebus.windows.net".


Key takeaways:
  • Note the double underscore in the setting's name; the suffix 'fullyQualifiedNamespace' signifies that the function app should use its managed identity to communicate with the service bus.
  • When you define the setting in the above format, there is no need to specify the connection property in the output binding attribute in your code. By default, the runtime will look for the connection using this name, i.e. "AzureWebJobsServiceBus" (see the sketch below).
  • If you have the connection name property initialized in your code, the setting's name changes to 'yourconnectionnameusedincode__fullyQualifiedNamespace'.
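
To make this concrete, here is a minimal sketch of an in-process function with a Service Bus output binding and no Connection property. The function name, queue name ("orders"), and HTTP trigger are made up for the example, and it assumes the Microsoft.Azure.WebJobs.Extensions.ServiceBus 5.x extension, which supports identity-based connections:

using System.IO;
using Microsoft.AspNetCore.Http;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class ForwardToQueue
{
    // No Connection property on the ServiceBus attribute, so the runtime falls back to the
    // default "AzureWebJobsServiceBus" prefix, i.e. AzureWebJobsServiceBus__fullyQualifiedNamespace.
    [FunctionName("ForwardToQueue")]
    [return: ServiceBus("orders")]
    public static string Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        ILogger log)
    {
        string body = new StreamReader(req.Body).ReadToEnd();
        log.LogInformation("Forwarding the request body to the Service Bus queue.");
        return body; // the returned value is written to the queue by the output binding
    }
}

With the fullyQualifiedNamespace setting in place and the function app's managed identity granted a Service Bus data role (e.g. Azure Service Bus Data Sender) on the namespace, messages flow without any connection string secret.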

For additional reference, you can visit https://learn.microsoft.com/en-us/azure/azure-functions/functions-identity-based-connections-tutorial-2#connect-to-service-bus-in-your-function-app. However, please note that the documentation can be somewhat tricky to understand and implement, which is why this post exists.
 
Hope this helps someone! 

Sunday, March 12, 2023

Managing Azure VMs / Arc-enabled server configuration drifts made easy with Azure Automanage - Part 1

Managing servers across different environments may be complex, especially when dealing with a range of configurations, security rules, and compliance needs. Thankfully, Azure Arc provides a uniform management experience for hybrid and multi-cloud environments, allowing you to scale the deployment, administration, and governance of servers. 

To further simplify configuration management, you can use Desired State Configuration (DSC) packages to define and enforce the desired state of your server infrastructure, and one of Microsoft Azure's recent offerings, Azure Automanage, can help you do this efficiently.

What is Azure Auto-manage?

Azure Automanage is a Microsoft service that helps simplify cloud server management. You can use Automanage to automate routine management tasks for your virtual machines across various subscriptions and locations, such as security updates, backups, and monitoring. This keeps your servers up to date, secure, and optimized for performance without you having to spend a lot of time and effort on manual operations. You can read more about it here - https://learn.microsoft.com/en-us/azure/automanage/overview-about

Note that this blog assumes that you have a basic knowledge about Azure and Desired State Configuration (DSC) in general. If you are not familiar with these technologies, it is recommended that you brush up your skills by going through the official Microsoft documentation at https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/dsc-overview. This will help you better understand the concepts and features discussed in this blog and make the most of the Azure and DSC capabilities.

The first post of this two-part blog series will cover the following:

  • Know your pre-requisites 
  • Creating and compiling the DSC file
  • Generating configuration package artifacts
  • Validating the package locally on the authoring machine and checking compliance status
  • Deployment options for the package

Scenario: To keep things simple and easy to understand, this post will not create a complicated real-world scenario. Instead, it will use a simple DSC script that ensures that the server always maintains a text file on its disk drive as an example. This will be the state to maintain throughout the post. However, it's important to note that there are no restrictions on referring to this concept and extending your implementation by introducing your own logic in your PowerShell scripts or using different DSC providers, such as registry keys.

The illustration below can be used to visualize the entire implementation workflow, and the following steps will provide a detailed explanation of each.



The following example and steps have been executed and tested on a Windows operating system configuration; you could use a Linux OS too, but a few steps and commands might vary.

Pre-requisites 
Before proceeding with the sample script and implementing the scenario, it is crucial to configure the machine and confirm that all necessary prerequisites are installed on it. This machine is typically known as an authoring environment.

Here are some key artifacts you would need to have in the authoring environment
  • Windows or Ubuntu OS
  • PowerShell v7.1.3 or above
  • GuestConfiguration PS Module
  • PSDesiredStateConfiguration Module

Create and Compile DSC script:
Given this context and scenario, let's examine what the DSC script file would resemble.


The script above imports the PSDscResources module, which is necessary for the proper compilation and generation of the .mof file as a result of running the DSC script.

I have observed that individuals with limited experience in DSC often become confused after preparing their DSC script and are uncertain about how to compile it to produce the output, which is the .MOF file. 

To compile the DSC script and generate the .MOF file, you can follow these steps: Open the PowerShell console (preferably PS 7 or higher), navigate to the directory where the DSC file is saved on your local authoring environment, and then execute the .ps1 DSC file.

What is a MOF file?
The .MOF file generated after compiling the DSC (Desired State Configuration) script is a file that contains the configuration data and metadata necessary to apply the desired configuration to a target node. The MOF file is consumed by the Local Configuration Manager (LCM) on the target node to ensure that the system is configured to the desired state.

Generate configuration package artifacts:
After generating the .MOF file, the next step is to create the configuration package artifacts from it. This is done with the New-GuestConfigurationPackage cmdlet from the GuestConfiguration module, and as a result the artifacts are bundled into a .zip file.

You can run the cmdlet in your authoring environment, pointing its -Configuration parameter at the compiled .MOF file and giving the package a name and a type.

Please be advised that there are several command parameters that you should be familiar with. You can refer to the official documentation for a more detailed understanding of these parameters. However, the most critical parameter is the "Type" parameter, which accepts two values: "Audit" and "AuditAndSet".

The value names themselves suggest the action that the LCM (Local Configuration Manager) will take once the deployment artifacts are produced. If you create the package in "Audit" mode, the LCM will simply report the status of the machine if it deviates from the desired state. On the other hand, creating a package in "AuditAndSet" mode will ensure that the machine is brought back to the desired state according to the DSC configuration you have created.

The .zip file will be produced in the directory your PowerShell console is currently set to. If you are interested in examining the contents of the zip file, you will find the package artifacts laid out similarly to what is described below.


The "modules" directory encompasses all the essential modules needed to execute the DSC configuration on the target machine once LCM is triggered. Additionally, the "metaconfig.json" file specifies the version of your package and the Type, as previously discussed in this post. The presence of the version attribute in this file indicates that you can maintain multiple versions of your packages, and these can be incremental as you continue making changes to your actual DSC configuration PowerShell files.

Validation and compliance check:

After generating the package, the subsequent step involves validating it by running it locally in the authoring environment to ensure that it can perform as expected when deployed to the target machines.

Typically, this is a two-step process: the first step checks the compliance of the machine (Get-GuestConfigurationPackageComplianceStatus), followed by running the actual remediation (Start-GuestConfigurationPackageRemediation), both pointed at the generated .zip package.

As mentioned earlier, the second command takes into account the Type parameter value present in the metaconfig.json. This implies that if the package is designed solely for auditing the status, the remediation script will not attempt to bring the machine to the desired state. Instead, it will only report it as non-compliant.

Deployment options:

Before deploying the package to your target workloads, there are a few things you should keep in mind. Firstly, the package should be easily accessible during deployment so that the deploying entity can read and access it. Secondly, you should ensure the presence of the guest configuration extension to enable guest configurations on your target Windows VMs. Additionally, make sure that the target servers have managed identities created and associated.

To ensure that the package is accessible, one option is to upload it to a central location, such as Azure Storage. You can choose to store it securely and grant access to it using shared access signatures. In the next part, we will explore how to access it during the deployment steps. Optionally, you can also choose to sign the package with your own certificate so that the target workload can verify it. However, ensure that the certificate is installed on the target server before starting the deployment of the package to it.

Regarding the second point mentioned above, i.e., ensuring that the target workloads (Azure VMs or Arc enabled servers) have their managed identities, a recommended best practice is to use Azure policy / initiative and assign it to the scope where your workloads are hosted. This policy ensures that all the prerequisites for creating the package deployment, such as performing guest assignments, are correctly met.

Here is the initiative that I have used in my environment; it contains 4 policies in total that ensure all the requirements are met before you deploy the package.

Guest Assignment:

After uploading the package zip to the accessible location and assigning the initiative to target workloads, the final step would be to deploy the package. Azure provides a dedicated resource provider, known as the Guest Configuration provider, to assist you in this process through the guest configuration assignment resource. You can read more about it here https://learn.microsoft.com/en-us/rest/api/guestconfiguration/guest-configuration-assignments

You can also access all the guest configuration assignments through the Azure portal via a dedicated blade, i.e. Guest Assignments.

As the Guest Configuration resource is an Azure resource, you can deploy it as an extension to new or existing virtual machines. You can even integrate it into your existing Infrastructure as Code (IaC) repository for deployment via your DevOps process, or deploy it manually. Additionally, it supports both ARM and Bicep modes of deployment.

As an example, the Bicep template for this resource declares a Microsoft.GuestConfiguration/guestConfigurationAssignments resource scoped to the virtual machine, referencing the package's content URI and hash.

While deploying the Guest Configuration resource manually or via DevOps can work, it's recommended to use Azure Policies to ensure proper governance in the environment. This ensures that both existing and new workloads are well-managed and monitored against the configurations defined in the DSC file. In the next post, we will discuss this in detail and leverage a custom Azure Policy definition to create the Guest Assignment resource. We will also explore the various configuration options available.

As we bid adieu to this blog post, let's remember to keep the coding flame burning and the learning spirit alive! Stay tuned for part 2, where we shall delve deeper into the exciting world of Azure policies and custom policy definitions.

Monday, February 6, 2023

Building a Terraform template to securely push application telemetry to App insights workspace bypassing local authentication

Azure Application Insights workspace is a cloud-based service for monitoring and analyzing the performance of applications. It provides real-time insights into the application's behavior, such as request and response times, user behavior, and error rates. 

In the past, Azure Application Insights was primarily used programmatically through its web APIs or various SDKs by providing an instrumentation key. This key was required to interact with the platform, extract insights about the application's performance, or query the data stored in it. However, this experience was limited because it lacked native identity authentication, making it challenging to secure the instrumentation key. Developers had to take extra precautions to secure and store the key, which added overhead to the development process. This absence of native identity authentication left the workspace open to potential security breaches and unauthorized access to data.

Recently, Microsoft has made significant changes to Azure Application Insights Workspace to support Azure Active Directory (Azure AD) authentication. This has enabled developers to opt-out of local authentication and use Managed Identities instead. 

By using Managed Identities, telemetry data can be exclusively authenticated using Azure AD, providing a more secure and streamlined way of interacting with the platform. With this change, developers no longer need to worry about managing and storing the instrumentation key securely, as the authentication is handled by Azure AD. This improves the security of the telemetry data and reduces the overhead associated with managing authentication credentials. 

This blog post assumes that the reader has a basic understanding of how Azure Active Directory integration is enabled for the Azure Application Insights workspace. If not, it is recommended that you do the reading on MS Learn to understand the details and also take a look at the feature pre-requisites.

The focus of this blog post is on how to configure Azure AD integration using a Terraform template and validate it using a sample .NET web API that talks to the Application Insights Workspace securely using its managed identity when deployed on an Azure Web App. 

Let's take a look at what a Terraform template responsible for deploying the resources below looks like:

  • Resource group.
  • App service plan.
  • Web app with its system-assigned managed identity.
  • Log Analytics workspace along with the App Insights resource.
  • Role assignment to grant the required permission to the web app's managed identity on the App Insights resource.

There are a few key points that need to be focused on.  Firstly, the flag "local_authentication_disabled" must be set to "true" in the Application Insights configuration. This disables local authentication and enables the use of Azure AD for authentication. Secondly, the Azure resource role "Monitoring Metrics Publisher" is a pre-requisite for communication between the telemetry publisher and the Application Insights Workspace. This role must be assigned to the managed identity of the web app resource in order for it to be able to communicate with the Application Insights resource. 

Focusing on these two points will ensure that the Terraform template is set up correctly and the web app is able to communicate with the Application Insights securely using Azure AD authentication.

Now that the Terraform template for configuring Azure AD integration has been discussed, it's time to focus on verifying the setup. The easiest way to do this is to write a sample web API code and deploy it to the Azure Web App resource that was provisioned in the previous step. This will allow us to see if the telemetry data starts flowing to the Application Insights Workspace. 

For this post, a .NET 6 web API project with VS 2022 will be created with minimal code that configures the connectivity between the web app resource and the Application Insights. This project will be deployed to the web app and the telemetry data will be monitored in the Application Insights Workspace to confirm that the integration has been set up correctly.

Here is how the Program.cs of the web API could look, with values hard-coded for brevity.
Also, note that in order to integrate AAD-based authentication in your source code, it is important to refer to the correct version of the SDKs; for that reason, you might need to install the Application Insights .NET SDK starting with version 2.18-Beta3.
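
A minimal .NET 6 Program.cs along these lines could look as follows; the connection string value is a placeholder and would normally come from configuration rather than being hard-coded:

using Azure.Identity;
using Microsoft.ApplicationInsights.Extensibility;

var builder = WebApplication.CreateBuilder(args);

// Tell the Application Insights SDK to authenticate using the web app's managed identity.
builder.Services.Configure<TelemetryConfiguration>(config =>
{
    config.SetAzureTokenCredential(new ManagedIdentityCredential());
});

// The connection string still identifies the target Application Insights resource.
builder.Services.AddApplicationInsightsTelemetry(options =>
{
    options.ConnectionString =
        "InstrumentationKey=<your-instrumentation-key>;IngestionEndpoint=https://<region>.in.applicationinsights.azure.com/";
});

builder.Services.AddControllers();

var app = builder.Build();
app.MapControllers();
app.Run();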

Two important points stand out in the sample code above. Firstly, the use of the "ManagedIdentityCredential" provider to perform authentication using the managed identity; this allows the web API to communicate with the Application Insights workspace securely using Azure AD authentication. Secondly, the connection string still contains the instrumentation key and ingestion endpoint.

At this point, it may seem counterintuitive that the instrumentation key is still being used despite the goal of not specifying it. However, the instrumentation key is still required for configuring the connection between the web API and the Application Insights workspace: it acts as an identifier for the workspace and allows the SDK, authenticated via the "ManagedIdentityCredential" provider, to reach the correct resource.

It is important to note that, since local authentication is disabled in the Application Insights, only Azure AD objects such as managed identities can successfully authenticate to it and the "Monitoring Metrics Publisher" role must be granted to the managed identity in order to allow it to communicate with the Application Insights Workspace.

With this setup in place, you should be ready to start seeing telemetry data from your application in the Application Insights Workspace.

In summary, when local authentication is disabled in the Application Insights workspace, it is essential to use the "Monitoring Metrics Publisher" role in addition to the instrumentation key in order to publish telemetry data. By following this setup, you can ensure that your telemetry data is securely sent to the correct Application Insights workspace, while taking advantage of the enhanced security and ease of use provided by Azure AD authentication and managed identities.

Saturday, March 23, 2019

Integrating Azure QnA Maker Service as a Bot Middleware

Recently I have been working with the Azure QnA Maker cognitive service, integrating it with a bot built using Bot Framework SDK v4, and there were clearly two choices for the implementation approach:
  • Implement it as a regular workflow in bot 
  • Implement it as a bot middleware
This article assumes that you understand the bot framework and are aware of a few essential concepts such as bot services, configuration, and middleware. If you are new to it, then it is highly recommended that you go through this documentation and learn it before proceeding further.

In short, middleware is a component which sits between the channel adapter and the bot and can see and process each request sent to the bot and any response sent by the bot to the user.

In a nutshell, the channel adapter is a component which is responsible for converting channel data into a JSON format which can be understood by the bot framework.

As we can see in the image above, middleware components sit one after another in the execution pipeline before a message sent to the bot is received by it, and they are executed in reverse order when a response is sent back to the user by the bot; hence the registration sequence of each middleware matters.

One practical example could be a transcript or telemetry logging middleware that records every incoming and outgoing activity.


Now let’s get to the point and see how we can integrate QnA maker cognitive service as middleware. Why does it make sense to create it as a middleware?

Well, one of the reasons for this decision was that we first wanted to try to answer any user message by searching the QnA repository, and if nothing is fetched, the bot's core logic can take care of the message and run business rules on it.

You can go through the process of creating a QnA Maker service in the Azure portal here and understand how to train it by feeding data; your sources can be a public URL of your documentation, TSV files, or Excel sheets containing FAQs.

To work with the QnA service and use it inside the bot, you will need four pieces of configuration - typically the knowledge base ID, the endpoint key, the hostname, and a name for the service.
All of these can be part of your .bot file and can be initialized at the time of your bot startup in Startup.cs. Here is a sample .bot file entry with the QnA service configuration.


and Startup.cs will look like this
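
The exact shape depends on the BotServices helper used in the samples, but the startup wiring could conceptually look like this (a sketch only; the .bot file name is illustrative, and the required namespaces are Microsoft.Bot.Builder.AI.QnA and Microsoft.Bot.Configuration):

// Inside Startup.ConfigureServices (conceptual sketch)
var botConfig = BotConfiguration.Load(@".\MyQnABot.bot");
services.AddSingleton(botConfig);

// Build a QnAMaker client from the QnA entry of the .bot file.
var qnaConfig = botConfig.Services.OfType<QnAMakerService>().First();
var qnaMaker = new QnAMaker(new QnAMakerEndpoint
{
    KnowledgeBaseId = qnaConfig.KbId,
    EndpointKey = qnaConfig.EndpointKey,
    Host = qnaConfig.Hostname,
});
services.AddSingleton(qnaMaker);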


Please note that the code above is just for conceptual reference and is based on the available samples here. You can find the definition of BotServices.cs at the same link.

And now with this, since we have configured the bot with the QnA Maker service, let's move on to the middleware.
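
A minimal sketch of such a middleware, built on the v4 QnAMaker client and matching the behavior described below (class and variable names are illustrative):

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Bot.Builder;
using Microsoft.Bot.Builder.AI.QnA;
using Microsoft.Bot.Schema;

public class QnAMakerMiddleware : IMiddleware
{
    private readonly QnAMaker _qnaService;

    public QnAMakerMiddleware(QnAMaker qnaService)
    {
        _qnaService = qnaService ?? throw new ArgumentNullException(nameof(qnaService));
    }

    public async Task OnTurnAsync(ITurnContext turnContext, NextDelegate next, CancellationToken cancellationToken = default)
    {
        // Only consult the knowledge base for incoming user messages.
        if (turnContext.Activity.Type == ActivityTypes.Message && !string.IsNullOrWhiteSpace(turnContext.Activity.Text))
        {
            var results = await _qnaService.GetAnswersAsync(turnContext);
            var topAnswer = results?.OrderByDescending(r => r.Score).FirstOrDefault();

            // 0.5 (50%) is the confidence threshold used in this sample; tune it to your needs.
            if (topAnswer != null && topAnswer.Score > 0.5)
            {
                // Answer from the knowledge base and short-circuit the rest of the pipeline.
                await turnContext.SendActivityAsync(topAnswer.Answer, cancellationToken: cancellationToken);
                return;
            }
        }

        // No confident answer found: let the bot's core logic handle the turn.
        await next(cancellationToken);
    }
}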


As you can see, the source code is quite self-explanatory: all we are doing is implementing the OnTurnAsync method of the IMiddleware interface and simply calling the QnA service to fetch results.

Note that this implementation has been kept simple for reference, but you can customize it, e.g. by processing the retrieved response before handing it over to the user, logging it, etc.

Also note that currently it checks for the scores of retrieved results from QnA service and tries to build the top response. You can further configure the 50% threshold value as per your need.

If we receive any response from the QnA service, we end the conversation there and hand the response directly back to the user, without invoking the further components sitting in the pipeline, in this sample.

Now the last part is to make the bot aware of this middleware by registering it in Startup.cs.
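
With the SDK v4 ASP.NET Core integration package, the registration could look roughly like this (MyQnABot is your bot class, and qnaMaker is the QnAMaker instance created from the .bot configuration earlier):

// Inside Startup.ConfigureServices
services.AddBot<MyQnABot>(options =>
{
    // ...credential provider, state, and error handling are configured here as usual...

    // Register the middleware; its position in options.Middleware defines its place in the pipeline.
    options.Middleware.Add(new QnAMakerMiddleware(qnaMaker));
});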


And that’s it, now your QnA service is set using a bot middleware.
Hope this helps someone.

Sunday, March 10, 2019

Building Proactive Messaging Bot using Bot framework SDK v4

I have been working with the recently released bot framework for the last few weeks, and one of the interesting requirements I came across was to build a proactive bot which can send messages to users on its own.

Now this sounds like a very straightforward and quite common requirement, as having this capability in bots opens a gateway to a number of opportunities in an enterprise environment, e.g. sharing live analytics with power users, or sending cautions / warnings to system administrators based on past data. However, there is seriously limited documentation available on achieving these real-world scenarios, specifically with Bot Framework SDK v4.

In this post, I am going to share the experience I had while developing a proactive bot, the challenges I faced, and the approach I took by referring to several links, and I will also share what I learned.

Hitting the road block


To start with, at the time of writing this article, the only documentation and official sample available from Microsoft are the ones mentioned below. Please go through these links and understand the essentials involved in getting a proactive bot design working.

Coming back to the original requirement we had, which was in line with what the MSDN documentation says, i.e.
"If the user has previously asked the bot to monitor the price of a product, the bot can alert the user if the price of the product has dropped by 20%"
But the big question was - how exactly do you do that?

Understanding basics


Now to understand this, let's get into the technical details and understand what it essentially means to send a proactive message to the end user from the bot's perspective.

A proactive conversation is nothing but an event or a message that the bot sends to the user, and there is one fundamental requirement here: the user should have initiated a conversation with the bot at least once before, otherwise it does not make sense for users to start receiving messages from a bot they do not know!

Now coming to the next point: as we know, the overall bot framework is built around the idea of a web API and is stateless, hence all the conversations, dialogs, or messages which we send to the bot are essentially listened to by one web API endpoint (if you have noticed, there is always this default route and endpoint https://localhost:1234/api/messages). Whenever this API receives a message, it is processed and the response is sent back to the user via a specific channel (a channel is just the medium the user is using to talk to the bot, e.g. Teams, Direct Line, or Skype).

The requirement I was working on needed some trigger from an external system to the bot endpoint, so that the bot could process the incoming signal and then send out a conversation to a specific user.

Let’s understand this using a quick diagram


Note that, to send a conversation proactively to the user, the bot MUST know who the user is and what their conversation reference is; otherwise it won't be able to initiate the conversation with the user.

Approach


Content below is heavily inspired by this github thread and a big shout out to all the contributors on that thread – you guys are awesome!

Bringing it all together:



These are the basic steps we integrated to get proactive messaging working:
  • Record the conversation reference of each user / desired users
  • Use Azure table storage to store these records
  • Introduce an API to fetch the stored records from storage and generate a ProactiveMessage
  • Introduce / integrate the ProactiveController from this post
  • Trigger a POST call to the ProactiveController's method with the payload generated in step 3 and send a message from the bot to the specified user
Let's see each step in detail.

Step 1 and 2:


Because the pre-requisite is that the bot should know the conversation reference before it can send a message on its own, it becomes essential to have the conversation references stored somewhere.

Since we do not really need a relational data structure here, we can store it in Azure table storage, but you can choose to store it in a relational database table too.

How do we store?

We could write a custom middleware that detects whether an entry for the user exists in the underlying storage; if it does not, it is a new conversation and we record it. One big problem with this approach could be that, since it is a middleware, it would be invoked for every message and might impact performance.

Thankfully, since we have integrated Azure AD authentication in our bot, we knew the exact place where we could write this storing logic, i.e. whenever the user gets authenticated.

This is how the UserEntity that I am storing in table storage looks.
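
A comparable minimal sketch (property names other than FromId are illustrative):

using Microsoft.WindowsAzure.Storage.Table;

public class UserEntity : TableEntity
{
    public UserEntity() { }

    public UserEntity(string channelId, string fromId)
    {
        // Partitioning by channel and keying by user id is just one possible convention.
        PartitionKey = channelId;
        RowKey = fromId;
        ChannelId = channelId;
        FromId = fromId;
    }

    public string FromId { get; set; }
    public string FromName { get; set; }
    public string ServiceUrl { get; set; }
    public string ChannelId { get; set; }
    public string ConversationId { get; set; }
    public string BotId { get; set; }
    public string BotName { get; set; }
}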


An instance of this class needs to be initialized from the first incoming user request, which you can get hold of in the OnTurnAsync method.
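
For example, the entity could be populated and upserted along these lines (the _userTable field is an assumed, already-initialized CloudTable from Microsoft.WindowsAzure.Storage.Table, and the exact trigger point is up to you):

// Inside OnTurnAsync, once the user is known (e.g. right after a successful sign-in):
var activity = turnContext.Activity;

var userEntity = new UserEntity(activity.ChannelId, activity.From.Id)
{
    FromName = activity.From.Name,
    ServiceUrl = activity.ServiceUrl,
    ConversationId = activity.Conversation.Id,
    BotId = activity.Recipient.Id,
    BotName = activity.Recipient.Name,
};

// Upsert the record into Azure table storage so it can be looked up later.
await _userTable.ExecuteAsync(TableOperation.InsertOrMerge(userEntity));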


Step 3:


Because we have now stored the user entity (conversation reference) in table storage, we now need an API which can query the storage based on the inputs provided.

E.g. if I am storing the user name in the FromId property, then I would need the API to generate the activity payload for that specific user so that we can send it on to the ProactiveController, which takes care of beginning the proactive conversation with that user.

I have added a controller in the same bot project which returns the collection of ProactiveMessages
Here is my definition of ProactiveMessage.


And ActivityGeneratorController in short


And here is the sample output it generates


IMPORTANT NOTE:
This took a while for me to understand, hence sharing: the botId needs to be the Microsoft App ID of your bot; only then does it work.

Step 4 and 5:


Finally, you should integrate the ProactiveController into your bot project.
I am not going to share that source code here again because there is already a git project for it.

Validation:


Once this setup is complete, and if you have managed to generate the activity payload correctly by recording the correct user conversation references, you can try sending an HTTP POST request to the ProactiveController externally from any request generator like Postman, and you should see the proactive message with your text appear in the conversation of the user it was sent to. Note that in the JSON example above, the message with the text "Test Message" will appear in the user's Teams channel.

Update


The approach mentioned above stopped working in specific scenarios, e.g. when the user to whom the message needs to be sent becomes idle on the channel, proactive messaging started throwing the exception below:

Microsoft.Bot.Schema.ErrorResponseException: Operation returned an invalid status code 'Unauthorized'
   at Microsoft.Bot.Connector.Conversations.ReplyToActivityWithHttpMessagesAsync(String conversationId, String activityId, Activity activity, Dictionary`2 customHeaders, CancellationToken cancellationToken)
   at Microsoft.Bot.Connector.ConversationsExtensions.ReplyToActivityAsync(IConversations operations, String conversationId, String activityId, Activity activity, CancellationToken cancellationToken)
   at Microsoft.Bot.Builder.BotFrameworkAdapter.SendActivitiesAsync(ITurnContext turnContext, Activity[] activities, CancellationToken cancellationToken)
   at Microsoft.Bot.Builder.TurnContext.<>c__DisplayClass22_0.<<SendActivitiesAsync>g__SendActivitiesThroughAdapter|1>d.MoveNext()

and this puzzled us. Again, there is a lack of documentation and the information is scattered across different links and blogs. We came across one helpful link which gave some hints, and then we had to make changes in our ProactiveMessagingController logic, which now looks like this:
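
A hedged sketch of that updated logic is shown below; the method signature and variable names are illustrative, but the MicrosoftAppCredentials / ConnectorClient calls are the relevant part:

using System;
using System.Threading.Tasks;
using Microsoft.Bot.Connector;
using Microsoft.Bot.Connector.Authentication;
using Microsoft.Bot.Schema;

public static class ProactiveSender
{
    public static async Task SendProactiveMessageAsync(
        UserEntity user, string botAppId, string botAppPassword, string messageText)
    {
        // Trust the service URL before calling it, otherwise the connector call is rejected.
        MicrosoftAppCredentials.TrustServiceUrl(user.ServiceUrl);

        var credentials = new MicrosoftAppCredentials(botAppId, botAppPassword);
        var connector = new ConnectorClient(new Uri(user.ServiceUrl), credentials);

        var botAccount = new ChannelAccount(user.BotId, user.BotName);
        var userAccount = new ChannelAccount(user.FromId, user.FromName);

        // Create (or re-open) a direct conversation with the user...
        var conversation = await connector.Conversations.CreateDirectConversationAsync(botAccount, userAccount);

        var message = Activity.CreateMessageActivity();
        message.From = botAccount;
        message.Recipient = userAccount;
        message.Conversation = new ConversationAccount(id: conversation.Id);
        message.Text = messageText;

        // ...and post the proactive message into it.
        await connector.Conversations.SendToConversationAsync((Activity)message);
    }
}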



Note that we are now creating the channel accounts and initializing an instance of MicrosoftAppCredentials with the bot's app ID and secret (typically you would find these in the .bot file in your solution; if not, you can always get them from the bot registration portal), and then with the help of ConnectorClient we are now beginning conversations with the specific user.

Hope this helps someone looking to build proactive messaging using Bot Framework SDK v4; with this, one can externally trigger targeted proactive messages to a specific user, notifying them with a relevant message.

Saturday, March 2, 2019

Implementing database per tenant strategy in bots using Azure bot services and SDK v4

Recently I have been working on building a bot with the latest Bot Framework SDK, i.e. v4, and the aim was to provision the bot as SaaS. Meaning that we already had dedicated SQL Azure database instances provisioned for each tenant (which are also consumed by other line-of-business applications), and there was a need to create a bot which would be hosted with Azure bot service and authenticated by Azure Active Directory. Whenever a user logs in to the bot, based on the identity of the logged-in user, the bot should dynamically redirect all subsequent data read and write requests to the correct SQL Azure database instance for that user.

This article does not focus on writing an end-to-end bot covering the scenario above (but that can be a good candidate for later) and assumes that you already have some knowledge of the latest bot framework and are familiar with its fundamentals, along with .NET Core and Entity Framework Core. If you are new to it, then it is highly recommended that you go through the documentation here and then continue reading the rest of this article.

The basic architecture and overall ecosystem look like this


Note that the sole purpose of sharing the above diagram is to give you a high-level overview of the solution and to show where exactly the bot service comes into the picture. You can choose to ignore the data preparation and reporting part on the right, which involves ADF and Power BI.

To keep the content of this article focused on our scenario, we will not go into the details of how the bot can be configured to use Azure AD for authentication and blob storage for state management, etc. You can find the documentation for that on this and this link.

One quick point to mention here is that the Azure AD app needs to be configured as a multi-tenant app so that users from different active directories are able to get themselves authenticated.

Now coming back to the problem of redirecting requests to the right tenant based on the user's identity, let's see how that can be done.

Typically, when we want to use the EF data context object inside an ASP.NET Core application, we end up taking the help of DI and injecting it in Startup.cs, which looks something like this:
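
For example (MyDbContext and the connection string name are placeholders):

// Startup.ConfigureServices - the usual, static registration
services.AddDbContext<MyDbContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("DefaultConnection")));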


But this does not help here. Why? Because at the time of application startup, you do not know which database you want to point to.

The DbContext object should be initialized at run time, once we know who the user is and where the relevant database for that user is.

Now, to know where to send a user's data requests, we need to have some sort of mapping between the users logging in to the bot and the respective database connection strings. This mapping can be done in a number of ways, e.g. mapping the user's domain to a connection string, or mapping the user's tenant id to a connection string.

You can choose your own store for these mappings, i.e. keep them in a central SQL Azure database table, or maintain them in XML, JSON, in configuration, or in Azure Key Vault.

There is excellent documentation available by Gunnar Peipman explaining this scenario; you can have a look at it to understand the details of how it uses a JSON file to initialize the different mappings.

So the functional flow looks like this



Runtime initialization of the DbContext object can easily be achieved by overriding the OnConfiguring method of the DbContext class.
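
A minimal sketch of such a context could look like this (the class name is illustrative and the entity DbSet properties are omitted):

using Microsoft.EntityFrameworkCore;

public class TenantDbContext : DbContext
{
    private readonly string _connectionString;

    public TenantDbContext(string connectionString)
    {
        _connectionString = connectionString;
    }

    // DbSet<TEntity> properties for your tenant data go here.

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        // Called whenever the context is created, so the connection string
        // can be resolved per tenant at run time.
        optionsBuilder.UseSqlServer(_connectionString);
    }
}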


With this approach, there is not really a necessity to inject the DbContext at application startup; instead, you can initialize the DbContext explicitly whenever you need it. Note that every time you request or create your DbContext object, the OnConfiguring method will be called.

Extending the model


How do I get the connection string at run time?

The answer is, you can create your own connection string provider which returns the connection string based on the user, looking it up either in a SQL database, in configuration, or in Azure Key Vault.

For our specific scenario, since our bot is AD authenticated, once the user is logged on we know the tenant id of the user. We store that tenant id in a custom class, i.e. UserProfile, and keep it in the bot user state so that it can be accessed during any conversation.

As the mapping between the tenant id of the user and the actual connection string, we use Azure Key Vault to store the connection string value as a secret. You can refer to how to configure Azure Key Vault connected services in ASP.NET Core in this article - Add Key Vault to your web application by using Visual Studio Connected Services

The tenant connection string provider comes in handy when, tomorrow, you decide to maintain the mappings in some external secure service other than Key Vault; in that case, all you would need to do is replace the implementation of the tenant provider and that'll be all.

Here is how the tenant connection string provider can look (fetching connection strings from Azure Key Vault).
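
A sketch of such a provider is shown below; it assumes the Key Vault secrets surface through IConfiguration via the connected service, with one secret per tenant id (the naming convention is illustrative):

using Microsoft.Extensions.Configuration;

public interface ITenantProvider
{
    // Returns the connection string for the given tenant id.
    string GetTenant(string tenantId);
}

public class ConfigurationTenantProvider : ITenantProvider
{
    private readonly IConfiguration _configuration;

    public ConfigurationTenantProvider(IConfiguration configuration)
    {
        _configuration = configuration;
    }

    public string GetTenant(string tenantId)
    {
        // With the Key Vault configuration provider wired up, a secret named after the
        // tenant id is exposed as a regular configuration value.
        return _configuration[tenantId];
    }
}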


Introducing DbContextFactory


It is always a good approach to introduce the factory pattern to achieve additional isolation between the requester and the object creator.
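
A minimal sketch of such a factory could look like this, consuming the ConfigurationTenantProvider from the previous step:

using Microsoft.Extensions.Configuration;

public class TenantDbContextFactory
{
    private readonly ITenantProvider _tenantProvider;

    public TenantDbContextFactory(IConfiguration configuration)
    {
        // Kept simple for this example: the provider is created directly instead of being injected.
        _tenantProvider = new ConfigurationTenantProvider(configuration);
    }

    // A custom CreateDbContext that takes the tenant id, instead of the default
    // design-time signature that only accepts a string array.
    public TenantDbContext CreateDbContext(string tenantId)
    {
        var connectionString = _tenantProvider.GetTenant(tenantId);
        return new TenantDbContext(connectionString);
    }
}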


Note how we have created our own implementation of the CreateDbContext method instead of implementing the default method, which only takes a string array as input.

Also note how we have consumed the ConfigurationTenantProvider, which implements the ITenantProvider's GetTenant method and is configurable. We could have injected the ITenantProvider into the factory, but for this example we wanted to keep the setup simple and hence initialized it directly.

Let’s understand it with the help of a block diagram


Startup configurations


The last part would be to make the DbContextFactory available to all the dialogs, i.e. by injecting it into the container.
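
For example (registration shown first, followed by how a dialog that received the factory through its constructor might use it):

// Startup.ConfigureServices - make the factory available to all dialogs via DI
services.AddSingleton<TenantDbContextFactory>();

// Later, inside any dialog, once the signed-in user's tenant id is known (e.g. from user state):
// using (var dbContext = _dbContextFactory.CreateDbContext(userProfile.TenantId))
// {
//     // read or write tenant-specific data here
// }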


And since we have now registered the DbContextFactory in the DI container, you should easily be able to access it in any bot dialog and create your DbContext dynamically whenever needed.



Summary


The latest bot framework and EF Core are quite flexible. The approach explained above is one way to make your DbContext dynamic, and there could be multiple ways to do the same. Readers are encouraged to treat it as an option and customize it further as per their needs.