

								<p>If you’ve gone through tutorials and documentation for blockstack.js and Gaia,

								you’ll know the <code class="highlighter-rouge">blockstack.js</code> interface is dead simple. First, you

								authenticate a user into your app. Once that’s complete, you’re free to read and

								write app data in the user’s storage provider with two data operations:</p>


								<ul>

								  <li><code class="highlighter-rouge">putFile</code>: Writes to a specified path</li>

								  <li><code class="highlighter-rouge">getFile</code>: Gets the file at a specified path</li>

								</ul>


								<p>That’s it. You’re reading files and you’re writing files. All file types are

								supported, so you can choose to manage data with SQL, Markdown, JSON, or even

								your own custom format! Gaia operations are purposefully left primitive so that

								you have complete control over which tools you use on top. In the future, we

								imagine a variety of data management libraries will emerge that wrap Gaia and

								help you interact with your data layer via expressive APIs.</p>
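								<p>As a minimal sketch of those two operations, the snippet below stores a JSON value and reads it back. The <code class="highlighter-rouge">createMemoryStorage</code>, <code class="highlighter-rouge">saveJson</code>, and <code class="highlighter-rouge">loadJson</code> helpers are hypothetical names, and the in-memory object is only a stand-in that mimics the promise-based shape of <code class="highlighter-rouge">putFile</code> and <code class="highlighter-rouge">getFile</code> so the example runs without a signed-in user. Resolving with <code class="highlighter-rouge">null</code> for a missing file is an assumption of this sketch.</p>

```javascript
// Hypothetical in-memory stand-in that mimics the promise-based shape of
// putFile/getFile, so this sketch runs without a Blockstack session.
function createMemoryStorage() {
  const files = new Map();
  return {
    // Write (or overwrite) whatever is stored at `path`.
    putFile: async (path, content) => {
      files.set(path, content);
      return path;
    },
    // Resolve with the stored content, or null if nothing is stored (assumed).
    getFile: async (path) => (files.has(path) ? files.get(path) : null),
  };
}

// Serialize a value to JSON and write it to `path`.
async function saveJson(storage, path, value) {
  return storage.putFile(path, JSON.stringify(value));
}

// Read `path` and parse it as JSON; resolves with null for a missing file.
async function loadJson(storage, path) {
  const raw = await storage.getFile(path);
  return raw === null ? null : JSON.parse(raw);
}
```

								<p>In an app, the <code class="highlighter-rouge">storage</code> argument would presumably be backed by <code class="highlighter-rouge">blockstack.js</code> after authentication rather than by this stand-in.</p>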


								<p>If you’re anything like most developers, you’re probably used to working with

								highly abstracted libraries that offer collection management, querying,

								pagination, documented schema models, and more. Developing apps on Blockstack is

								thrilling because you quickly learn that you don’t need training wheels. You can

								create a meaningful and complex data layer using two methods: <code class="highlighter-rouge">putFile</code> and

								<code class="highlighter-rouge">getFile</code>. This limited interface forces you to think about your fundamental

								data architecture and make some decisions about how you’re modeling data to gain

								back the benefits you get with large frameworks.</p>


								<p>This series will focus on teaching you to think like a Blockstack developer

								working with Gaia. Let’s get started!</p>


								<h2 id="working-with-data-collections">Working with Data Collections</h2>


								<p>For the purposes of this tutorial, let’s pretend that we’re building a simple

								grocery list app called Grublist. As a user of Grublist, you should be able to

								create, read, update, and delete grocery lists.</p>


								<p>Let’s think in terms of JSON since it’s easy and familiar.</p>


								<h3 id="single-file-collection-approach">Single-File Collection Approach</h3>


								<p>Here’s a Single-File Collection approach to modeling our data:</p>


								<div class="highlighter-rouge"><pre class="highlight"><code>// grocerylists.json

								{

								  "3255": {

								    "items": [

								      "1 Head of Lettuce",

								      "Haralson apples"

								    ]

								  },

								  // ...more lists with items

								}

								</code></pre>

								</div>


								<p>Notice that items are stored as an array nested inside of each grocery list.</p>


								<p>This is conceptually the simplest way to manage your grocery lists. It’s very

								easy to wrap your brain around what’s going on with the data. When you read the

								<code class="highlighter-rouge">/grocerylists.json</code> file, you get back exactly what you need: grocery lists and

								their items. When you write, you’re always writing to one place.</p>


								<p>There is one caveat to this approach that you should seriously consider: Every

								time you update one of your grocery lists in any way, you’re overwriting the

								entire file of all your grocery lists. This is because using the <code class="highlighter-rouge">putFile</code>

								method will overwrite anything at <code class="highlighter-rouge">/grocerylists.json</code> if it exists, so if

								you’re doing a write operation for a new grocery list, you must submit all

								previous grocery lists plus the new grocery list.</p>
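								<p>To make that read-modify-write cycle concrete, here is a rough sketch of adding one list to the single file. The <code class="highlighter-rouge">addGroceryList</code> helper and the in-memory <code class="highlighter-rouge">createMemoryStorage</code> stand-in are hypothetical; only the <code class="highlighter-rouge">putFile</code>/<code class="highlighter-rouge">getFile</code> shape is taken from the text above.</p>

```javascript
// Hypothetical in-memory stand-in for the user's storage (not blockstack.js).
function createMemoryStorage() {
  const files = new Map();
  return {
    putFile: async (path, content) => {
      files.set(path, content);
      return path;
    },
    // Assumed behavior: a missing file resolves with null.
    getFile: async (path) => (files.has(path) ? files.get(path) : null),
  };
}

const LISTS_PATH = "/grocerylists.json";

// Add one list by reading the whole collection, changing it in memory,
// and writing the whole collection back over the old file.
async function addGroceryList(storage, id, items) {
  const raw = await storage.getFile(LISTS_PATH);
  const lists = raw === null ? {} : JSON.parse(raw);
  lists[id] = { items };
  await storage.putFile(LISTS_PATH, JSON.stringify(lists));
  return lists;
}
```

								<p>Note that even a one-item change sends every existing list back through <code class="highlighter-rouge">putFile</code>.</p>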


								<p>That’s actually kind of scary, especially considering this code is running on

								the client where anything can go wrong. Imagine your client-side code encounters

								a parsing error with a user-input value and you overwrite two years’ worth of a

								user’s grocery lists with:</p>


								<div class="highlighter-rouge"><pre class="highlight"><code>"line 6: Parsing Error: Unexpected token ."

								</code></pre>

								</div>
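								<p>One way to reduce that risk, sketched here under assumptions, is to check the shape of the data before it ever reaches <code class="highlighter-rouge">putFile</code>. The <code class="highlighter-rouge">isGroceryListCollection</code> and <code class="highlighter-rouge">writeGroceryLists</code> names are hypothetical, and the in-memory storage is a stand-in rather than the real library.</p>

```javascript
// Hypothetical in-memory stand-in for the user's storage (not blockstack.js).
function createMemoryStorage() {
  const files = new Map();
  return {
    putFile: async (path, content) => {
      files.set(path, content);
      return path;
    },
    // Assumed behavior: a missing file resolves with null.
    getFile: async (path) => (files.has(path) ? files.get(path) : null),
  };
}

// Return true only for a plain object whose values each hold an `items`
// array of strings, matching the grocerylists.json shape shown earlier.
function isGroceryListCollection(value) {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return false;
  }
  return Object.values(value).every((list) => {
    if (list === null || typeof list !== "object") return false;
    if (!Array.isArray(list.items)) return false;
    return list.items.every((item) => typeof item === "string");
  });
}

// Validate before writing, so a bad value never replaces the stored file.
async function writeGroceryLists(storage, lists) {
  if (!isGroceryListCollection(lists)) {
    throw new TypeError("Refusing to overwrite /grocerylists.json with invalid data");
  }
  return storage.putFile("/grocerylists.json", JSON.stringify(lists));
}
```

								<p>A guard like this catches malformed values, but it does not protect against a well-formed collection that is simply missing lists.</p>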


								<p><strong>To summarize the Single-File Collection approach:</strong></p>


								<p>Pros:</p>


								<ul>

								  <li><strong>Simplified reads</strong>: Just request a single file and get back a list of all your data.</li>

								  <li><strong>Simplified writes</strong>: Some people might be more comfortable working with a JavaScript array of items on the client.</li>

								</ul>


								<p>Cons:</p>


								<ul>

								  <li><strong>Pagination is impossible</strong>: Using a simple storage strategy like this, you have no choice but to download the entire file of all grocery lists. A user could have 1,000 grocery lists, and every time they open your app they would be forced to download all 1,000 lists’ worth of data.</li>

								  <li><strong>Too heavy-handed</strong>: This is the issue I mentioned above about overwriting an entire file of all grocery lists. Generally, you should try to avoid managing entire collections of data at a time.</li>

								  <li><strong>Less control over file permissions</strong>: You’ll need to perform data acrobatics if you want to share only a single grocery list with a trusted party.</li>

								</ul>


								<h3 id="multi-file-collection-approach">Multi-File Collection Approach</h3>


								<p>It would be great if we could split out grocery lists into their own files to

								minimize the risk of destroying all the user’s grocery list data and make it

								easier to paginate the lists.</p>


								<p>Here’s a diagram of a Multi-File Collection approach:</p>


								<p><img src="/images/tutorials/grocery-lists.png" alt="Diagram of the Multi-File Collection approach" style="max-width: 80%;" /></p>


								<p>With this approach, we maintain an index file that stores an array of list IDs.

								Each list ID is predictably the name of a file under a <code class="highlighter-rouge">grocerylists</code> folder.</p>
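								<p>The layout just described could be sketched roughly as follows. The <code class="highlighter-rouge">/grocerylists/&lt;id&gt;.json</code> naming scheme, the helper names, and the in-memory storage stand-in are all assumptions made for illustration.</p>

```javascript
// Hypothetical in-memory stand-in for the user's storage (not blockstack.js).
function createMemoryStorage() {
  const files = new Map();
  return {
    putFile: async (path, content) => {
      files.set(path, content);
      return path;
    },
    // Assumed behavior: a missing file resolves with null.
    getFile: async (path) => (files.has(path) ? files.get(path) : null),
  };
}

const INDEX_PATH = "/grocerylists.json";
// Assumed naming scheme: one JSON file per list, named after the list ID.
const listPath = (id) => `/grocerylists/${id}.json`;

// Read the index of list IDs; an absent index is treated as an empty one.
async function readIndex(storage) {
  const raw = await storage.getFile(INDEX_PATH);
  return raw === null ? [] : JSON.parse(raw);
}

// Create or update one list. The index is rewritten only for a new ID.
async function saveGroceryList(storage, id, items) {
  await storage.putFile(listPath(id), JSON.stringify({ items }));
  const ids = await readIndex(storage);
  if (!ids.includes(id)) {
    await storage.putFile(INDEX_PATH, JSON.stringify([...ids, id]));
  }
}

// Read a single list without touching any other list.
async function getGroceryList(storage, id) {
  const raw = await storage.getFile(listPath(id));
  return raw === null ? null : JSON.parse(raw);
}
```

								<p>Updating an existing list writes only that list’s file, so a failed write cannot clobber the rest of the collection.</p>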


								<p><strong>To summarize the Multi-File Collection approach:</strong></p>


								<p>Pros:</p>


								<ul>

								  <li><strong>Performant pagination</strong>: Just request the <code class="highlighter-rouge">grocerylists.json</code> file, and from there you can request as many of the collection items as you need.</li>

								  <li><strong>Less risk of data corruption</strong>: By manipulating only one grocery list at a time, you ensure that a failed write operation won’t affect the other grocery lists. You might say, “but I still need to overwrite the list of IDs every time I operate.” That’s true, but you can optimize your code so that you only update that file when you add or remove a grocery list. A list of IDs is also much easier to manage than a big list of user-input data.</li>

								  <li><strong>More control over file permissions</strong>: If you wanted to share only a single grocery list with a trusted party, it’s much easier to do when the list data is isolated to its own file.</li>

								</ul>


								<p>Cons:</p>


								<ul>

								  <li><strong>More network requests</strong>: If you have 10 grocery lists and want to fetch them all, you’re going to make 11 network requests: one for the index file plus one per list. Using HTTP/2 and requesting a limited number of items at a time will help with performance.</li>

								  <li><strong>More complex architecture</strong>: Rather than simply requesting the file of all your data, you now have to request each item individually and stitch them all together once all requests have finished.</li>

								</ul>
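								<p>The pagination and stitching trade-offs above can be illustrated with a short sketch that fetches one page of lists in parallel. The <code class="highlighter-rouge">getGroceryListsPage</code> helper, the zero-based <code class="highlighter-rouge">page</code> parameter, the per-list file naming, and the in-memory storage stand-in are assumptions, not part of <code class="highlighter-rouge">blockstack.js</code>.</p>

```javascript
// Hypothetical in-memory stand-in for the user's storage (not blockstack.js).
function createMemoryStorage() {
  const files = new Map();
  return {
    putFile: async (path, content) => {
      files.set(path, content);
      return path;
    },
    // Assumed behavior: a missing file resolves with null.
    getFile: async (path) => (files.has(path) ? files.get(path) : null),
  };
}

// Fetch one page of lists (page is zero-based): one request for the index,
// then one request per list on that page.
async function getGroceryListsPage(storage, page, pageSize) {
  const rawIndex = await storage.getFile("/grocerylists.json");
  const ids = rawIndex === null ? [] : JSON.parse(rawIndex);
  const pageIds = ids.slice(page * pageSize, (page + 1) * pageSize);

  // Start every request for this page at once and wait for all of them.
  const files = await Promise.all(
    pageIds.map((id) => storage.getFile(`/grocerylists/${id}.json`))
  );

  // Stitch the results back together in index order.
  return pageIds.map((id, i) => ({ id, ...JSON.parse(files[i]) }));
}
```

								<p>With a page size of 5, a user with 1,000 lists would trigger 6 requests per page view instead of downloading everything.</p>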


								<h3 id="implementing-these-approaches">Implementing These Approaches</h3>


								<p>We’ve shown you conceptually how you might think about organizing your data, but

								you have not seen much implementation code. Check out the sandbox linked below

								for an implementation of services that can accommodate either the single-file or

								multi-file approaches for all of your collections.</p>


								<p>Note that we’ve included an interstitial “driver” layer for a few reasons:</p>


								<ol>

								  <li>In the sandbox, we’re swapping out the Blockstack driver for a localStorage driver for demonstration purposes.</li>

								  <li>It’s good practice to have all data flow through an interface you control, so that you can add logging or perform other operations.</li>

								  <li>If the <code class="highlighter-rouge">blockstack.js</code> API changes in the future, you can update your code once in the driver.</li>

								  <li>You can DRY up your code by declaring the Gaia config once per collection.</li>

								</ol>
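								<p>The sandbox code is not reproduced here, but a driver layer along the lines of the reasons above might look something like this sketch. The <code class="highlighter-rouge">read</code>/<code class="highlighter-rouge">write</code> interface and all three factory names are hypothetical. The Blockstack driver receives the library object as a parameter and forwards to <code class="highlighter-rouge">getFile(path, options)</code> and <code class="highlighter-rouge">putFile(path, content, options)</code>; the option values are whatever your collection needs.</p>

```javascript
// Driver backed by any Web Storage-like object (getItem/setItem),
// for example window.localStorage in a browser.
function createWebStorageDriver(store) {
  return {
    read: async (path) => store.getItem(path),
    write: async (path, content) => {
      store.setItem(path, content);
    },
  };
}

// Driver backed by blockstack.js. The library object is passed in, and the
// per-collection options are declared once here instead of at every call.
function createBlockstackDriver(blockstack, { getOptions = {}, putOptions = {} } = {}) {
  return {
    read: (path) => blockstack.getFile(path, getOptions),
    write: (path, content) => blockstack.putFile(path, content, putOptions),
  };
}

// Wrap any driver so every operation is logged before it runs.
function withLogging(driver, log = console.log) {
  return {
    read: async (path) => {
      log(`read ${path}`);
      return driver.read(path);
    },
    write: async (path, content) => {
      log(`write ${path}`);
      return driver.write(path, content);
    },
  };
}
```

								<p>Because both drivers expose the same two methods, collection code written against <code class="highlighter-rouge">read</code> and <code class="highlighter-rouge">write</code> does not need to know which backend it is talking to.</p>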


								<p>Click the button below to spin up a sandbox environment:</p>


								<p><a href="https://codesandbox.io/s/8kzmjjr9nj"><img src="images/tutorials/edit-sandbox.png" alt="Edit sandbox" /></a></p>


								<h2 id="summary">Summary</h2>


								<p>There are many valid ways to organize your data and you should pick what makes

								the most sense for your needs. I would recommend using a single-file collection

								approach for predictably small collections of data. For larger collections, the

								risk of data corruption is too high to pass an entire collection’s worth of data

								through one <code class="highlighter-rouge">putFile</code> request, so opt for an architecture that looks

								more like the multi-file collection model.</p>


								<p>Most importantly, feel free to experiment with data architecture. There are

								concepts and patterns you can introduce into this process that can help you

								validate schema, migrate data, and more. Check out more of our tutorials for a

								deeper dive into developing a sample app.</p>