TL;DR: Look at what your GUI does. Automate that. And you’ve implemented “APP as API”.
A phrase I’ve found myself using in various training sessions is “App as API”.
I think I’ve mentioned it in some previous blog posts, but I want to expand on the concept here.
- Some applications provide an API
- Most applications provide a GUI
When the application provides an API, I like to take advantage of that. But…
- Sometimes the API doesn’t do everything I want.
- Sometimes the functionality I want to use is only available from the GUI
Do I let that stop me?
No.
So what do I do?
What do I do?
I build the API that I want to use. And I’ll abstract the GUI and API behind that.
So I might have:
- API-I-want-to-use
create_user(name, username, password)
login_user(username, password)
change_password(username, new_password)
create_project(username, project_name)
This is an example, and a first draft, so it is pretty crude. As the API grows I’ll refactor it into different objects and a better hierarchy, but this is a start.
create_user
- has to delegate to an API abstraction since only the API allows me to create users
login_user
- I can log in via the API or via the GUI, and both return a valid session id, but only the GUI sets the session cookie, so I start by using the GUI automation
change_password
- I can change password from the GUI and the API, so I pick the API
create_project
- the API doesn’t support creating projects so I use the GUI
Fairly simple. I’m building an abstraction layer that supports what I need to do, but I don’t really care how it is implemented.
At least to start with.
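The delegation described above could be sketched roughly like this. It is a minimal illustration, not the actual implementation: `AppApi`, the `api` and `gui` backend objects, and the `RecordingBackend` stand-in are all hypothetical names, and a real version would wrap an HTTP client and a GUI automation tool.

```python
class RecordingBackend:
    """Hypothetical stand-in backend: records calls instead of touching a real app."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def __getattr__(self, task):
        # Only reached for attributes that do not exist, i.e. the task names.
        def record(*args):
            self.calls.append((task, args))
            return "%s:%s" % (self.name, task)
        return record


class AppApi:
    """The API I want to use; each task hides which backend does the work."""

    def __init__(self, api, gui):
        self.api = api  # assumed wrapper around the official API
        self.gui = gui  # assumed wrapper around the GUI automation tool

    def create_user(self, name, username, password):
        # Only the official API allows creating users.
        return self.api.create_user(name, username, password)

    def login_user(self, username, password):
        # Only the GUI sets the session cookie, so start with the GUI.
        return self.gui.login_user(username, password)

    def change_password(self, username, new_password):
        # Available from both, so pick the API.
        return self.api.change_password(username, new_password)

    def create_project(self, username, project_name):
        # The official API cannot create projects, so use the GUI.
        return self.gui.create_project(username, project_name)


app = AppApi(RecordingBackend("api"), RecordingBackend("gui"))
app.create_user("Bob", "bob", "secret")
app.create_project("bob", "demo")
```

The calling code only ever sees `AppApi`, so swapping the backend behind any one task later does not change the code that uses it.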
Over time we decide that the GUI usage when we automate is a pain. It keeps spawning browsers and sometimes, because of the work we are doing, the browsers are left open. So we decide to change. We want to stop using a tool that automates the GUI.
But…
Then I can’t create projects.
What do I do?
I could:
- inject projects directly into the database,
- but there is a risk of referential integrity issues and
- we have some sort of painful permissions process to gain write access to the database and yada yada yada the project admin is getting in the way.
(Inspired by actual events)
What else could I do?
I can treat the APP as an API
Instead of thinking of the GUI as a ’thing’, I’m going to view the GUI as an abstraction.
And what does that abstraction do?
Well, it sends HTTP requests to the backend when I fill in a form.
Therefore we could automate the sending of the HTTP requests, rather than automate the GUI?
Yup.
How?
By using the GUI, and passing it through a proxy, we can see what requests the application issues to fulfil its contract as an API when it “creates a project”.
And lo, we discover that it issues a number of HTTP requests:
- create a project
- add the user to the project
- create a default ‘context’
- add the context to the project
Well that explains the database referential integrity issues.
It also reveals a ‘risk’ at the GUI level we never knew about. What if only some of the requests make it through?
Perhaps that explains some of the referential integrity issues we’ve been encountering in the database when the system is used under high load?
Regardless.
We’re going to replicate the HTTP requests as the implementation in our API
So I take the ‘session id’ from the API login and add that to the headers in the HTTP requests. Now I no longer need to use the tool that automates the GUI when I use this “App as API” mode.
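As a rough sketch of what replaying those four requests might look like, using only the Python standard library: the paths, JSON payloads, the `SESSIONID` cookie name and the `id` field in the responses are all assumptions made for illustration. The real values are whatever the proxy shows your application sending.

```python
import json
import urllib.request


def make_request(base_url, session_id, path, payload):
    """Build a POST request that carries the session id from the API login."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Cookie": "SESSIONID=" + session_id,  # hypothetical cookie name
        },
        method="POST",
    )


def send_json(request):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def create_project(base_url, session_id, username, project_name, send=send_json):
    """Replay the requests the GUI issues when it 'creates a project'."""
    def post(path, payload):
        return send(make_request(base_url, session_id, path, payload))

    # All paths and payload shapes below are assumed, not taken from a real app.
    project = post("/projects", {"name": project_name})
    post("/projects/%s/users" % project["id"], {"username": username})
    context = post("/contexts", {"name": "default"})
    post("/projects/%s/contexts" % project["id"], {"contextId": context["id"]})
    return project["id"]
```

Passing `send` as a parameter is a design choice for the sketch: it lets the sequence be exercised with a fake sender, without a running application.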
“With great automating comes great responsibility and potentially increased risk” I think Peter Parker’s Uncle (tm, all rights reserved, fair usage clause invoked, no copyright violation intended) said something like that.
We have to take responsibility for the fact that we have automated this way.
And by responsibility I mean we implement a mitigation strategy
To mitigate the risk that the GUI changes, the HTTP requests change, and our abstraction layer no longer matches the GUI, we will have code that:
- uses the GUI library
- starts up a code-controllable proxy (e.g. https://bmp.lightbody.net/)
- uses our “App as API” in “full GUI mode” to log in and create a project
- captures the requests in the proxy as it does so
- compares the requests the proxy captures with the requests we encoded in our API
- asserts that the requests are the same
If we do the above, then our risk mitigation test will fail if the GUI changes, but we haven’t updated our code.
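The comparison step might look something like the sketch below. It assumes the proxy can hand back its capture in HAR format (a `log.entries` list where each entry has a `request` with `method` and `url`), and that comparing the method and path of state-changing requests is enough. The function names and the expected request list are hypothetical.

```python
from urllib.parse import urlsplit


def requests_in_har(har, methods=("POST", "PUT", "DELETE")):
    """Reduce a HAR capture to (method, path) pairs for state-changing requests."""
    found = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        if request["method"] in methods:
            found.append((request["method"], urlsplit(request["url"]).path))
    return found


def assert_gui_matches_api(har, expected):
    """Fail if the GUI no longer sends the requests we encoded in our API."""
    captured = requests_in_har(har)
    assert captured == expected, (
        "GUI sent %r but our API encodes %r" % (captured, expected)
    )


# Hand-written capture standing in for what the proxy recorded in full GUI mode.
captured_har = {"log": {"entries": [
    {"request": {"method": "GET", "url": "http://app.test/style.css"}},
    {"request": {"method": "POST", "url": "http://app.test/projects"}},
    {"request": {"method": "POST", "url": "http://app.test/contexts"}},
]}}

assert_gui_matches_api(
    captured_har,
    [("POST", "/projects"), ("POST", "/contexts")],
)
```

Filtering out GET requests keeps page assets such as stylesheets from causing false failures. Whether that filter is right depends on the application, and a stricter check could compare request bodies as well.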
Hey, as a side-effect, if we run this often, then we are also covering the risk that the GUI might not be able to create a project.
Cool.
And that’s what I mean by APP as API
Instead of defaulting to driving the APP through its GUI, which can be slower than issuing HTTP requests, we treat the APP as an API. When we do this:
- We start to understand the application more
- We make our automated code less ‘in your face’ (i.e. no browsers popping up)
- We can use this to support our testing interaction with the app because the API is defined at a task level that we understand for testing rather than the API that the designer wanted to expose
- We identified risks that had gone unnoticed because we started testing at a more technical level
- And thousands of other benefits that we’ll realise over time (or we might not, the above might be enough)
When we bypass GUI controls, and use ‘unofficial’ interfaces, we might increase risk.
In some cases we can easily automate the continual monitoring of those risks.
I often do this when I’m automating 3rd party applications because they either don’t give me an API or don’t expect an end user to want to do what I want to do.
I’ve used this on projects when the automated executions started taking too long, or failed because of ’tool’ interaction issues.
And now you can experiment with your APP as an API.