Store Presence on App Store #1 – Let’s Play: Scrape

This is #1 of the how-to series on custom store presence analysis and plotting using Python, Jupyter and lots of sciency-graphy libraries.

#1 - Let's Play: Scrape    <<
#2 - Let's Play: Cleanup   [ DONE ]
#3 - Let's Play: Optimize  [ DONE ]
#4 - Let's Play: Analyze   [ TODO ]
#5 - Let's Play: Visualize [ TODO ]
#6 - ...

Preparing a business plan requires systematic procrastination: Store presence analytics!

All work and no play makes Jack a dull boy.

Preparing a business plan requires fourteen things: a business, a dozen cups of coffee and systematic procrastination.

I am sure you can handle the coffee and the business. So for now, I will only try to help with the procrastination:

I need some insight into the App Store presence of our latest game, Twiniwt. I visit App Annie’s Featured page, as usual.

If you don’t know about App Annie, this is a good review of the service.

Analytics you find on such dedicated services are good as general performance metrics. However, you might want the data to support a particular claim in your business plan. The freely available content is hardly useful for that: the signal-to-noise ratio makes it hard to read, and filtering is the only thing you can do with the data.

We have to do some ad-hoc data science and visualization in order to get better results. Now, what was data science, again?

I live in Istanbul and I prefer a budget GNU/Linux laptop at work. Obviously, I am no data scientist. Still, some half-assed data science is better than none.

Let’s play: STORE PRESENCE ANALYTICS!

This will require at least five steps:

1. Scrape: Gather the data and understand its current structure.
2. Cleanup: Restructure the data and delete its irrelevant fields.
3. Optimize: Convert the restructured data to a format that is faster to batch process.
4. Analyze: Do filtering, mapping, grouping and analysis over the data.
5. Visualize: Plot and visualize the data frames.

We will dedicate a separate post to each of those titles. The data analysis and visualization steps can actually grow to multiple posts as we go.
We won’t delve into the subject of integration, yet we will rely on cross-platform, portable data formats. Besides, we will mainly use Python, which is a very sound choice for integration.

We won’t be concerned about storage either, as the data will already shrink significantly while we are optimizing for performance.

Before we begin, let’s look at some coffee beans and get hyped!

SCRAPE

First things first. If I want to analyze store presence data, I need the store presence data, and I need it in a form I can process. I am already familiar with App Annie’s daily feature tables. Here we go:

Go to your app’s page.

Go to User Acquisition >> Featured.

This is good stuff, even though the data itself is not. 😒

App Store Presence Feature Table
WTF are those app positions! 1714? Seriously?

Anyway.

When I click the Export button, I get this.

Connect this app? Hmm, why not!

Now, I am sure App Annie’s premium solution provides various great views and customizations of the data. However, being an indie studio and all, we cannot afford it.

Hey, there is another way! I can connect my app. But what do I see when I click connect? It is asking for my App Store developer account password. I don’t care what encryption they use, there is no way I am filling in a Connect form that asks for my developer account password.

So, I guess I am on my own, but that is OK.

Be warned that I do not know if App Annie ever intended or actually permitted the use of the data they transfer to client computers in such a way. So I will not tell you what to do, I will just describe how one would access such freely available data in a structured data format.

Why not have a closer look at the transferred data in Firefox? Right click anywhere on the page and click Inspect Element.

Go to the Network tab.

App Annie Featured Page

Now this is where you can track requests and responses. Whenever you change the table options, a new set of data will be sent to you, the client, in JSON format.

The entry that contains the data you asked for is something like this:

/ajax/ios/app/your-app-name/daily-feature...

That is the app’s daily store presence aggregated according to your preferences.

Saving Your Data

When you need to save a piece of data a web server sent you, you simply right click the corresponding entry in the Network tab, then do Copy >> Copy Response.

Open your favourite text editor and paste the clipboard contents into a new JSON file.
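Before moving on, it is worth checking that the pasted file really is valid JSON, since a partial copy is easy to miss. Below is a minimal Python sketch of such a check. The function name `load_response` is my own placeholder, and I am assuming the file was saved as UTF-8, which matters because the responses contain non-ASCII text.

```python
import json
from pathlib import Path


def load_response(path):
    """Read a saved response file and return the decoded JSON value.

    Assumes the file was saved as UTF-8. Raises ValueError with the
    file name if the content is not valid JSON (e.g. a truncated paste).
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"{path} is not valid JSON: {err}") from err
```

Calling `load_response` on each saved file right after pasting catches a broken copy early, before it surprises you in a later step.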

If you want to quickly do that for multiple data sets, you can instead do Copy >> Copy as cURL, and modify the date etc. in the copied curl command.
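If you go the curl route for a range of days, it helps to generate the list of dates and matching file names up front. The sketch below does only that and deliberately does not build the request itself. The helper names and the file naming scheme are my own placeholders, and the ISO `YYYY-MM-DD` format is an assumption: check how the date actually appears in your own copied command.

```python
from datetime import date, timedelta


def daily_dates(start, end):
    """Yield every date from start to end, both inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def output_name(app_slug, day):
    """Build a file name for one day's saved response (hypothetical scheme)."""
    return f"{app_slug}-featured-{day.isoformat()}.json"
```

For each date from `daily_dates`, you would substitute the date into the copied command and save the output under the name returned by `output_name`.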

DATA STRUCTURE

Let’s check out a sample store presence in an App Annie Daily Featured response.

[
  [
    {
      "image": "https://static-s.aa-cdn.net/img/ios/...",
      "type": "icon",
      "thumb": "https://static-s.aa-cdn.net/img/ios/..."
    }
  ],
  [
    "China",
    "CN"
  ],
  "iPhone",
  "Board",
  "Collection List",
  "N/A",
  2,
  4,
  6,
  [
    {
      "existence": false,
      "detail": null,
      "label": "Featured Home"
    },
    {
      "existence": false,
      "detail": null,
      "label": "Board"
    },
    {
      "existence": true,
      "detail": {
        "position": [
          6
        ],
        "row": [
          4,
          4
        ]
      },
      "parent": "免费",
      "label": "Twiniwt"
    }
  ],
  [
    "N/A",
    0,
    100,
    ""
  ]
]

Now, compare it to the table, not too carefully, though. (I couldn’t be bothered to find the exact same record. 😎)

App Store Presence Feature Table
1714? Why do you even put that into the table? That is just wasted network bandwidth.

Below is a breakdown of the row contents:

Creative [GARBAGE]: URLs to store creative for the app.
Country: The App Store country name as well as the two-letter ISO 3166-1 alpha-2 code for it.
Device: iPhone or iPad.
Category Page: The featured category page where the placement is displayed.
Type: The feature type of the final placement displayed in the app store.
Subtitle [GARBAGE]: The text shown alongside feature banners and collection titles.
Depth: The number of steps necessary to see the final feature placement.
Row [DUPLICATE]: The final row number along the path leading to the feature placement.
Position: The final position number along the path leading to the feature placement.
Feature Path [DIRTY]: A detailed path of where the feature placement was shown in the app store.
Premium Content [GARBAGE]: Locked additional content, only activated for premium account queries.
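
Since the record is a plain positional list, giving the positions names makes it much easier to poke around. Here is a small Python sketch that does so, assuming every record has exactly the eleven fields listed above, in that order. The class and field names are my own, not anything App Annie defines.

```python
from typing import NamedTuple


class FeatureRow(NamedTuple):
    """One raw record, with fields in the order they appear in the response."""
    creative: list        # [GARBAGE] image URLs
    country: list         # [name, ISO 3166-1 alpha-2 code]
    device: str           # "iPhone" or "iPad"
    category_page: str
    feature_type: str
    subtitle: str         # [GARBAGE]
    depth: int
    row: int              # [DUPLICATE] also found inside feature_path
    position: int
    feature_path: list    # [DIRTY] list of path steps
    premium: list         # [GARBAGE]


def parse_row(raw):
    """Wrap a raw positional record in a FeatureRow, checking its length."""
    expected = len(FeatureRow._fields)
    if len(raw) != expected:
        raise ValueError(f"expected {expected} fields, got {len(raw)}")
    return FeatureRow(*raw)
```

With this in place, `parse_row(record).country[1]` reads a lot better than `record[1][1]`.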

Here is the cleaned-up version of the store presence data we want to work with.

[
  "CN",
  "iPhone",
  "Board",
  "Collection List",
  2,
  [
    4,
    4
  ],
  6,
  [
    "Home",
    "Board",
    "免费"
  ]
]
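
To make the mapping between the two listings concrete, here is a rough Python sketch that turns the raw sample into the cleaned-up one. The rules are my guess from this single sample and not necessarily the cleanup we will do in the next part: I assume the "Featured " prefix is dropped from path labels, that the final step is represented by its "parent" collection rather than the app label, and that the row pair comes from the final step's detail. The function names are placeholders.

```python
def clean_path(feature_path):
    """Reduce the path steps to a list of names (rules inferred from one sample)."""
    prefix = "Featured "
    names = []
    for step in feature_path:
        if step["existence"]:
            # Assumption: keep the containing collection, not the app's own label.
            names.append(step.get("parent", step["label"]))
        else:
            label = step["label"]
            if label.startswith(prefix):
                label = label[len(prefix):]
            names.append(label)
    return names


def clean_row(raw):
    """Drop the GARBAGE fields and flatten the rest of one raw record."""
    (_creative, country, device, category_page, feature_type,
     _subtitle, depth, _row, position, feature_path, _premium) = raw
    # Assumption: the row pair lives in the detail of the step that exists.
    placed = [s for s in feature_path if s["existence"] and s["detail"]]
    rows = placed[-1]["detail"].get("row", []) if placed else []
    return [country[1], device, category_page, feature_type,
            depth, rows, position, clean_path(feature_path)]
```

Feeding the raw sample record to `clean_row` reproduces the cleaned-up listing above. Records with a different path shape may well need different rules.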

Awesome, but enough with the data scraping for today. I will show you how to clean up and restructure the JSON data to fit our needs in the next part.

Check back for Part 2 of the series! If you have any questions or advice, please comment below. Thank you!

Related Links

Check out the following official App Store guide to building your store presence:

Make the Most of the App Store

Published by

Kenan Bölükbaşı

Founder, Project Leader and Developer at 6x13 Games. Game Developer & Designer, CG Generalist, Architect. Theoretical and applied knowledge in programming, design and media. Broad experience in project management. Experience in 3D (mesh, solid & CAD), 2D (raster, vector), and parametric graphics as well as asset pipelines and tools development. Blender 3D specialist (Blender Foundation Certified Trainer).
