
First pass of the new pages

Fixing table
Fixing typos and other things found in review

Signed-off-by: Mary Anthony <mary@blockstack.com>
feat/clarity-updates
Mary Anthony 5 years ago
parent
commit
49f73e21d9
1. _data/navigation_learn.yml (1)
2. _develop/app-rankings.md (40)
3. _develop/app-reviewers.md (97)
4. _develop/mining_intro.md (38)
5. _includes/mining-ranking.md (1)

_data/navigation_learn.yml (1)

@@ -34,6 +34,7 @@
docs:
- develop/mining_intro
- develop/app-reviewers
- develop/app-rankings
- develop/mining_enroll
- community/app-miners-guide
- develop/appMiningFAQ

_develop/app-rankings.md (40)

@@ -0,0 +1,40 @@
---
layout: learn
permalink: /:collection/:path.html
---
# How scores become rankings
{:.no_toc}
{% include mining-ranking.md %}
* TOC
{:toc}
## Z-Scores
First, Blockstack determines a `z-score` for each ranking. This is a statistical technique to account for different distributions of scores within categories. The following formula is used to calculate a `z-score` for an app:
`if(App score = 0, -1, (App score - average(App score:App score)) / stdev(App score:App score))`
After computing the z-scores, Blockstack considers the app category. It computes the average and the standard deviation of the z-scores within each category.
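As a rough illustration only, here is a minimal Python sketch of this z-score step. The names (`z_scores`, `scores`) are illustrative, not from Blockstack's tooling; a raw score of 0 is pinned to `-1` as in the spreadsheet formula above.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Z-score for each app score in one category.

    Mirrors the spreadsheet formula above: a raw score of 0 is
    pinned to -1; otherwise subtract the mean and divide by the
    standard deviation of the category's scores.
    """
    avg, sd = mean(scores), stdev(scores)
    return [-1 if s == 0 else (s - avg) / sd for s in scores]

# One category of raw app scores:
print(z_scores([60, 75, 90, 45]))
```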
## Theta-Scores
For each app’s score in a category, Blockstack determines how many standard deviations it is away from the average score in that category. Theta scores standardize reviewer results so they can be compared to other app reviewer data. Theta-scores are calculated by this formula:
`if(App’s Avg. Z-score > 0, App’s Avg. Z-score^0.5, -(ABS(App’s Avg. Z-score)^0.5))`
For example, let’s say a category has an average score of 60, with a standard deviation of 15. A score of 90 would get a z-score of 2, because it’s 2 standard deviations higher than the average.
Once each app has a calculated z-score in every category, the average of those 4 z-scores results in a final number. A higher number is better than a lower one, and so apps are ranked from highest to lowest.
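For concreteness, a minimal sketch of the theta transformation given by the formula above; the function name is illustrative:

```python
def theta(avg_z):
    """Signed square root of an app's average z-score.

    Mirrors `if(Avg. Z-score > 0, Avg. Z-score^0.5, -(ABS(Avg. Z-score)^0.5))`:
    positive averages are compressed by a square root, and negative ones
    symmetrically, so high scores need bigger gains to move further.
    """
    return avg_z ** 0.5 if avg_z > 0 else -(abs(avg_z) ** 0.5)

print(theta(2.0))   # approximately 1.414
print(theta(-2.0))  # approximately -1.414
```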
## Final Results
The final results are determined by first averaging the Theta values of the app’s scores, then applying a history score. App Mining weighs past results of the program alongside new results to track improvements and give weight to each month's rankings.
Apps that have been in the program for more than a month use a 25% weighting on their history score (the score last round) and 75% on their average score this month. Here is the equation that is used:
`if(Score last round = 0, New average score, (0.75 * New average score + 0.25 * Score last round))`
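A minimal sketch of that history weighting, with illustrative names; per the `Score last round = 0` branch above, a missing previous score falls back to the new average:

```python
def final_score(new_average, score_last_round):
    """Blend this month's average with last round's score, 75/25."""
    if score_last_round == 0:  # first round in the program: no history yet
        return new_average
    return 0.75 * new_average + 0.25 * score_last_round

print(final_score(80.0, 0))     # 80.0 (no history yet)
print(final_score(80.0, 60.0))  # 75.0
```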
To see all the formulas in action, please see the results and formulas used in [the full sheet of App Mining audit results](https://docs.google.com/spreadsheets/d/13PXIJhEhTusjVT9elYS3LnGqSj6DBjTUDCzB_R6Inkw/edit?usp=sharing).

_develop/app-reviewers.md (97)

@@ -2,7 +2,7 @@
layout: learn
permalink: /:collection/:path.html
---
# How third-party reviewers operate
# How apps are reviewed and scored
{:.no_toc}
Blockstack uses third-party reviewers who interact with, review, and provide scores for the apps in the App Mining program. The scores from each reviewer are used to determine an app's ultimate rank. In this section, you learn more about the reviewers and how they score.
@@ -16,80 +16,65 @@ Product Hunt is the place to find the best new products in tech. They have a mas
Their community score is determined only by the number of **credible** upvotes an app received on Product Hunt, relative to other apps that are registered. For example, if an app got more upvotes than any other app in a cohort, their community score would be 100. If a different app got 60% as many upvotes, they’d get a score of 60. Blockstack converts these scores into *z-scores*.
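A minimal sketch of that relative scoring, with illustrative names (`community_scores`):

```python
def community_scores(upvotes):
    """Scale each app's credible upvote count so the cohort leader gets 100."""
    top = max(upvotes.values())
    return {app: 100 * count / top for app, count in upvotes.items()}

# The app with the most credible upvotes scores 100; one with 60% as many scores 60.
print(community_scores({"app-a": 500, "app-b": 300}))
```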
More info on the Product Hunt launch component and recommendations is in the [Application Miner's Guide to Promoting your App]({{ site.baseurl }}/community/app-miners-guide.html).

## Digital Rights Reviewer

As a "digital rights reviewer," New Internet Labs reviews apps submitted to the App Mining program based on the degree to which the apps respect and protect users' fundamental digital rights. For these reviews, the use of Blockstack Auth and Blockstack IDs is evaluated. The reviewer checks if Blockstack Auth is:

- the only auth method
- one of multiple methods (presented equally with all others)
- a secondary auth method
- not used at all

{% include note.html content="The digital rights reviewer does not provide an exact testing environment. Instead, the reviewer tests various environments changing month to month. Reviews are conducted on the latest version of macOS, Windows, iOS, or Android with either the platform's default browser or the latest version of Chrome or Firefox." %}

If your app fails a test, by definition you're not eligible for App Mining since your app does not have working Blockstack Auth. For more detail, see [the scoring criteria](https://github.com/blockstack/app-mining/blob/master/DigitalRightsAuthScoringCriteria.pdf).

The digital rights reviewer also checks for an app's use of the Gaia Storage System. Each app is given a rating based on which category it falls into:

- doesn't use it at all
- uses it for some things (some data is stored elsewhere, or the reviewer is unable to determine if some critical user data is sent elsewhere)
- data is only stored in the Gaia Storage System

## Awario

Awario is a tool that brands world-wide use to better understand the online "conversation" surrounding their brand and to collect meaningful data with which to grow it. If you are interested in using Awario independent of App Mining, please [see Awario’s documentation](https://awario.com/help/) as it answers most questions about how the platform works.

Awario provides Blockstack with an awareness score. App Miners do not receive a score from Awario in their first eligible month. Instead, Awario spends the first month evaluating, honing, and updating an app's brand data queries. The purpose of this is to zero in on only relevant data.

### Awareness scoring

At a high level, Awario focuses on two major aspects of awareness: mentions and reach.

<table class="uk-table">
<tr>
<td>mentions</td>
<td>Captured mentions of a brand or app online, on social networks, and on news sites. A mention is registered when the name of the brand or app appears publicly, for example, a tweet mentioning the app name.</td>
</tr>
<tr>
<td>reach</td>
<td>The estimated reach of the combined mentions collected for your brand, for example, how far the tweet about said app traveled online.</td>
</tr>
</table>

Mentions are captured and provided to App Miners. For App Mining, the focus is on reach, which is less gameable and thus more suitable for App Mining. For example, it would be reasonably easy to create many fake individual mentions (for example, with a Twitter bot), but it would be unlikely that those fake mentions generate much if any actual reach. Using reach, Awario provides, beginning in the second month of an app's entry, a **Reach Score** and a **Growth Score**.

An app's **Reach Score** is based on the total reach of all eligible mentions for the previous calendar month. The score is `log10(total_reach)`. So, if an app reaches 10 people, the score is 1; 100 is 2; 1000 is 3; and so on. A `log10` calculation is much better than only using the actual reach because outliers would totally skew the distribution. No matter what an app's reach is, increasing this score by 1 requires improving reach 10x. Using `log10` is also similar to how Blockstack handles the `theta` function in the mining algorithm, because the higher a score, the more an app needs to improve to bump its score.

A **Growth Score** is the month-over-month (MoM) growth in an app's total reach (not `log10`). If an app went from a reach of 1000 to 1500, its MoM growth is 0.5 (or 50%).

Like all the other reviewers, the z-score is first calculated for each of these metrics, and then averaged. Then, the theta function is applied, resulting in a `final` Awario score.
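As a rough, illustrative sketch (names like `reach_score` are mine, not Awario's), the Reach and Growth scores described above could be computed like this:

```python
import math

def reach_score(total_reach):
    """log10 of last month's total reach: 10 -> 1, 100 -> 2, 1000 -> 3."""
    return math.log10(total_reach)

def growth_score(previous_reach, current_reach):
    """Month-over-month growth in raw (not log10) reach."""
    return (current_reach - previous_reach) / previous_reach

print(reach_score(1000))         # 3.0
print(growth_score(1000, 1500))  # 0.5, i.e. 50% MoM growth
```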
### What counts as reach

The Awario team directly builds the query and search parameters for mention alerts. This leverages their expertise on their own platform. For unique brand or app names, the query is fairly straightforward: the name is loaded into Awario and it begins crawling sites and networks for public instances of it. For common names, or names where context is important, such as a brand name like Stealthy, it is important to filter out non-relevant results like someone simply using the word ‘stealthy’. For these cases, Awario sets up much [more complex queries](https://awario.com/help/boolean-search/boolean-syntax-and-operators/) to trim the results down to the ones actually related to the project.

Websites are also excluded from reach totals. While Awario's website reach estimates are generally accurate and useful in a normal business use case, they can be gamed because of the way Awario estimates website reach: the site's Alexa rank is used in the calculation, meaning, for example, a mention on GitHub would register as massive reach, even if that particular mention didn’t really spread that far.

Mentions and associated reach from Blockstack accounts are not counted. This allows Blockstack PBC to support apps publicly, without worrying that it needs to be evenly distributed across apps, which simply isn’t possible. Also not counted are the handles of all Blockstack PBC employees and all official Blockstack PBC managed accounts such as `@blockstack`. This is so everyone at Blockstack PBC can continue to freely support applications without running the risk of unintentionally biasing the results.

### Mention auditing

As part of the monthly process for generating Awario scores, Awario also helps to audit the mentions coming in. This audit ensures that mentions that shouldn’t count don't. Awario is confident in their ability to only collect and count relevant mentions through their search operators. The Awario team is also available to answer any questions or concerns App Miners may have. Last, with Awario’s platform, it is extremely easy to remove individual mentions (and thus the associated reach count) or to blacklist any accounts found to be fraudulent or accounting for false-positive mentions.
## TryMyUI

TryMyUI’s panelists score using a special survey they developed expressly for the App Mining program: the ALF Questionnaire (Adoption Likelihood Factors). Desktop, iOS, and Android versions of apps are tested as applicable.

Answers to this questionnaire are used to calculate an overall score reflecting the following 4 factors:

* Usability
* Usefulness
* Credibility
* Desirability

Each factor corresponds to 4 questionnaire items, for a total of 16 items that comprise the ALFQ. Users mark their answers on a 5-point Likert scale, with 5 meaning **Strongly agree** and 1 meaning **Strongly disagree**. The final result is a score for each of the 4 factors, and a composite ALF score.

<img src="images/alf-score.png" alt="ALF score">

For example, consider an application that is both Android and iOS. Each platform version receives 4 tests. In total, 8 user tests are created, and the highest and lowest scores are dropped. App developers receive the raw TryMyUI scores. The App Mining process calculates z-scores for each category. As a result, the TryMyUI results in the App Mining scores differ from the raw scores visible in an app's TryMyUI account.
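As a sketch only: the page does not spell out exactly how the composite ALF score is computed, so this assumes each factor is the plain average of its 4 items and the composite is the average of the 4 factor scores; all names are illustrative.

```python
FACTORS = ("usability", "usefulness", "credibility", "desirability")

def alf_scores(answers):
    """16 Likert answers (1-5), four per factor, in a fixed order."""
    assert len(answers) == 16
    # Average each factor's 4 items, then average the factors into a composite.
    scores = {f: sum(answers[i * 4:(i + 1) * 4]) / 4 for i, f in enumerate(FACTORS)}
    scores["composite"] = sum(scores[f] for f in FACTORS) / len(FACTORS)
    return scores

print(alf_scores([5, 4, 4, 5, 4, 4, 3, 4, 5, 5, 4, 4, 3, 4, 4, 3]))
```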
## TryMyUI

TryMyUI provides 1.5M application testers around the globe, and apps enrolled in App Mining are reviewed by 10 users each month and rated on the following dimensions:

* Usability
* Usefulness
* Credibility
* Desirability

TryMyUI drops the highest and lowest test scores and uses the middle 8 scores each month for the rankings, calculating an average of scores for each component. TryMyUI has its own “history” component to increase the reliability of the tests. On average, projects need around 20 user tests to get actionable and reliable feedback. TryMyUI provides a monthly score that reflects 75% of the new month’s score and 25% of last month’s score. The calculation to find that is:

```
X = raw score of new month
Y = final (rolled) score from previous month

0.75X + 0.25Y = new month’s final score
```

TryMyUI tests occur from the beginning to the middle of the month, and Blockstack PBC cannot provide exact timing of the tests. App founders should not make any breaking changes to the app during this time. TryMyUI testers are English speaking. TryMyUI provides niche audiences based on the type of app. Founders can take a brief survey to fill out their preferred audiences. Read more about TryMyUI scoring and recommendations in our [Application Miner's Guide to Promoting your App]({{ site.baseurl }}/community/app-miners-guide/app-miners-guide.html#recommendations-from-trymyui).
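A minimal runnable sketch of the aggregation just described: drop the highest and lowest of the 10 scores, average the middle 8, then roll the result 75/25 with last month's final score (function name illustrative):

```python
def monthly_trymyui_score(scores, last_final=None):
    """Average the middle 8 of 10 test scores, then blend 75/25 with history."""
    middle = sorted(scores)[1:-1]           # drop the highest and lowest scores
    raw = sum(middle) / len(middle)         # X: raw score of new month
    if last_final is None:                  # first month: no rolled score yet
        return raw
    return 0.75 * raw + 0.25 * last_final   # 0.75X + 0.25Y

print(monthly_trymyui_score([70, 80, 85, 90, 75, 88, 92, 60, 95, 82], 78.0))
```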
## New Internet Labs

New Internet Labs ranks apps based on their use of Blockstack Authentication (Auth) and Gaia as a measure of digital rights. New Internet Labs may test on any browser or device of their choosing for web apps, and on the OS that matches the app if it is mobile.

For authentication, the scoring criteria follows:

| Rating | Blockstack Auth|
|---|---|
| `4` | Is the only authentication method.|
| `3` | Is the primary authentication method.|
| `2` | Is one of many authentication methods. |
| `1` | Is of secondary importance among many authentication methods.|
| `0` | Is not used at all.|
| `-1` | Is used but is broken.|

For Gaia, the scoring criteria follows:

| Rating | Gaia|
|---|---|
| `1` | Is used. |
| `0` | Is not used or the reviewer could not determine.|
| `-1` | Is used but is broken. |

New Internet Labs provides these raw scores to Blockstack. Apps that are found ineligible by New Internet Labs due to having an error in Auth are disqualified from App Mining.

## Awario

Awario provides data about app awareness through measures of reach, mentions, and growth. Awario rankings start counting at the start of the month the app was submitted, and the results are incorporated into the rankings the month following. This means that any data used in calculations is from the previous month. Apps new to the program do not incorporate this score in their first month of being enrolled.

Awario provides the following information for ranking:

- Reach is the estimated online *Reach of the Mention* collected on the brand. For example, reach would include how many impressions a tweet mentioning the app actually generated.
- The log of the `reach` score is calculated by `= if(X2=0, "", log10(X2))`.
- The Awario growth score is the percentage change in total reach from last month to this month (`NOT log10`).
- The Awario average is computed from the `reach` and `growth` z-scores. If `growth` is omitted, then this is just the `reach` z-score.

Blockstack publishes the Awario data sheet with all app mentions for auditing at the start of the audit period. Blockstack PBC employee social media accounts are omitted from the reach scores. There is also a manual scan of Awario data to remove any data suspected of a false match.

_develop/mining_intro.md (38)

@@ -14,42 +14,20 @@ This section explains App Mining, a program for developers. For Blockstack, App
{% include intro-appmining.md %}
## How apps are reviewed

Blockstack worked with a team of Ph.D. game theorists and economists from
Princeton and NYU to put together a [ranking
algorithm](https://blog.blockstack.org/app-mining-game-theory-algorithm-design/)
which is fair and resistant to abuse. Blockstack uses the third-party
reviewers: Product Hunt, Awario, TryMyUI, and Democracy. These reviewers are
independent, and generally rely on their own proprietary data and insights to
generate rankings.

To learn in detail about the reviewers' methods, see the page on [who reviews apps](app-reviewers.html).

## Reaching the final scores

Once the reviewer-partners generate reviews, each app has 5 raw scores between 0
and 100 for the following:

* Product Hunt team score
* TryMyUI ALF score

First, Blockstack determines a ‘z-score’ for each ranking category: community,
team, likability, and traction. This is a statistical technique to account for
different distributions of scores within categories. Second, Blockstack computes
the average and standard deviation of each category. Finally, for each app’s
score in that category, Blockstack determines how many standard deviations it is
away from the average score in that category.

For example, let’s say a category has an average score of 60, with a standard
deviation of 15. A score of 90 would get a z-score of 2, because it’s 2 standard
deviations higher than the average.

Once each app has a calculated z-score in every category, the average of those
4 z-scores results in a final number. A higher number is better than a lower
one, and so apps are ranked from highest to lowest.

## How apps are reviewed and ranked

Blockstack uses the third-party reviewers:

* Product Hunt
* Awario
* digital rights review
* TryMyUI
* Internet Labs

These reviewers are independent, and generally rely on their own proprietary data and insights to generate rankings.

{% include mining-ranking.md %}

Read more about [how reviewers score your app](app-reviewers.html) and [how your app ranking is calculated](app-rankings.html).
## Determining how much an app is paid

_includes/mining-ranking.md (1)

@@ -0,0 +1 @@
After the reviewer-partners generate reviews and scores, Blockstack uses Z-scores and Theta scores to standardize the scores across application categories. Blockstack worked with a team of Ph.D. game theorists and economists from Princeton and NYU to put together a ranking algorithm which is fair and resistant to abuse.