@tannerlinsley
Last active June 4, 2021 15:07
An RFC to rewrite React Query's API to allow for opt-in normalization

React Query v4 (Ezil Amron) RFC

Preface: This is an RFC, which means the concepts outlined here are a work in progress. You are reading this RFC because we need your help to discover edge cases, ask the right questions and offer feedback on how this RFC could improve.

What are we missing today without normalization?

Today, React Query uses unstructured query keys to uniquely identify queries in your application. This essentially means RQ behaves like a big key-value store and uses your query keys as the primary keys. While this makes things conceptually simple to implement and reason about on the surface, it does make other optimizations difficult (and a few others impossible):

  • Finding and using initial/placeholder data for single-item queries from list-like queries. While this is possible today, it's very manual, tedious, and prone to error.
  • Manual optimistic updates that span multiple query types and shapes (eg. single-item queries, list queries and infinitely-paginated queries) are tedious and prone to breakage as your app changes.
  • The corresponding rollbacks of said app-wide optimistic updates are brittle, manual, and error-prone, as you must implement this logic yourself for every mutation.
  • Auto-invalidation is near impossible since there is no schema backing queries and mutations are unaware of queries and their keys/data. Thus most, if not all, mutations usually result in manually calling invalidateQueries at the very least.
  • RQ's cache is currently a key-value store, which means it has a larger potential for taking up more memory if many queries store the same piece of data in different lists or locations.

What could React Query look like with an opt-in normalization-first API?

Queries would take on a new API based on query definitions and optional resource identification. Those APIs might look like this:

const todoListQuery = createListQuery({
  kinds: ['todo'],
  fetch: async (variables) => {
    const { data } = await axios.get('/todos', { params: variables })
    return data
  },
  getResources: (todos) =>
    todos.map((todo) => ({
      kind: 'todo',
      id: todo.id,
      data: todo,
    })),
})

const todoQuery = createQuery({
  kind: 'todo',
  getId: (variables) => variables.todoId,
  fetch: async (variables) => {
    const data = await api.get(`/todos/${variables.todoId}`)
    return data
  },
})

What is a resource?

"Resource" is a flexible term, which is why we chose it. In the context of React Query, it represents a fine-grained single item or object that you would normally request from a server. An individual todo or user is a good example, but not a list of todos or users.

What is resource identification?

Resource identification is the process of taking arbitrary data we receive from the server and extracting the information out of it that uniquely identifies it. Take the following todo object for example:

const todo = {
  id: '93jhft2of8fy3j',
  created: Date.now(),
  title: 'Do the dishes',
  isComplete: false,
  notes: `They're really piling up...`,
}

We could easily identify this resource like so:

const resource = {
  kind: 'todo',
  id: todo.id,
  data: todo,
}

This gives us a data structure that is consistent and reliable in helping us normalize many resources with different kinds and ids into a normalized database.
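To make the idea of a "normalized database" concrete, here is a minimal sketch of a store keyed first by kind and then by id. The `createResourceStore` helper and its `put`/`get`/`count` methods are hypothetical names used only for illustration; the RFC does not specify how the store would be implemented.

```javascript
// Hypothetical normalized store: kind -> (id -> data). Not part of the RFC.
function createResourceStore() {
  const byKind = new Map()

  return {
    // Insert or overwrite a resource under its (kind, id) pair.
    put({ kind, id, data }) {
      if (!byKind.has(kind)) byKind.set(kind, new Map())
      byKind.get(kind).set(id, data)
    },
    // Look a resource up by (kind, id); undefined when it is not cached.
    get(kind, id) {
      const ofKind = byKind.get(kind)
      return ofKind ? ofKind.get(id) : undefined
    },
    // Number of stored resources of a given kind.
    count(kind) {
      const ofKind = byKind.get(kind)
      return ofKind ? ofKind.size : 0
    },
  }
}

const store = createResourceStore()
const todo = { id: '93jhft2of8fy3j', title: 'Do the dishes', isComplete: false }

// The same todo arriving from two different queries is stored only once,
// and the most recent copy wins.
store.put({ kind: 'todo', id: todo.id, data: todo })
store.put({ kind: 'todo', id: todo.id, data: { ...todo, isComplete: true } })
```

Because every resource lives in exactly one slot, two queries that return the same todo would share it rather than each holding a copy.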

What are queries?

Queries are used to define dependencies on one or more resources from the server. There are 2 types of query definitions:

  • List Queries (includes paginated/infinite queries as well)
  • Resource Queries

List Queries

This is what a list query could look like:

const todoListQuery = createListQuery({
  kinds: ['todo'],
  fetch: async (variables) => {
    const { data } = await axios.get('/todos', { params: variables })
    return data
  },
  getResources: (todos) =>
    todos.map((todo) => ({
      kind: 'todo',
      id: todo.id,
      data: todo,
    })),
})

Another approach could be to expose a resource identification function:

const todoListQuery = createListQuery({
  kinds: ['todo'],
  fetch: ({ resource }) => async (variables) => {
    const { data: todos } = await axios.get('/todos', { params: variables })

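    // Return the identified resources; a block-bodied callback without
    // `return` would produce an array of undefined values.
    return todos.map((todo) =>
      resource({
        kind: 'todo',
        id: todo.id,
        data: todo,
      })
    )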
  },
})

A list query is used to define a group of resources that are fetched and synchronized from the server together. For example:

  • An array of all todos for a given user
  • An array of todos that have been filtered to a specific search term
  • An array of notification objects
  • An array of users

Did you notice how many times we used array there? 😉

Let's go over some of the basic options of list queries:

  • kinds: string[] - This is an array of strings where each string defines the kinds of resources this query might contain. While a vast majority of queries will likely only contain a single kind of resource, it's possible for a list query to return resources of different kinds, hence the array.

    createListQuery({
      kinds: ['todo'],
      // or
      kinds: ['user', 'bot'],
    })
  • fetch: (meta) => Promise<Data> - Similar to the query function in the current version of React Query, this function should return a promise that resolves your data from the server.

    createListQuery({
      fetch: async () => {
        const data = await api.getTodos()
        return data
      },
    })
  • getResources: (data) => Resource[] - A function that is used to identify the resources received from the fetcher function. In the example below, it receives the data from our fetcher function and returns an array of resources.

    createListQuery({
      getResources: (todos) =>
        todos.map((todo) => ({
          kind: 'todo',
          id: todo.id,
          data: todo,
        })),
    })
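As a rough sketch of how these three options could fit together, the snippet below takes data that a list query's `fetch` has already returned, runs it through `getResources`, writes each resource into a shared store, and keeps only references in the list entry. The `writeListResult` and `readList` helpers, the `kind:id` store keys, and the list key strings are assumptions made for illustration, not part of the proposed API.

```javascript
// Hypothetical helper, not part of the RFC: write a list query's fetched data
// into a normalized store and keep only references in the list entry.
function writeListResult(store, lists, listKey, query, data) {
  const resources = query.getResources(data)

  for (const { kind, id, data: item } of resources) {
    // `kinds` declares up front which kinds this query may contain.
    if (!query.kinds.includes(kind)) {
      throw new Error(`Unexpected kind "${kind}" for list "${listKey}"`)
    }
    store.set(`${kind}:${id}`, item)
  }

  // The list itself holds (kind, id) references rather than copies of the data.
  lists.set(listKey, resources.map(({ kind, id }) => ({ kind, id })))
}

// Resolve a list back into data by following its references.
function readList(store, lists, listKey) {
  return (lists.get(listKey) || []).map(({ kind, id }) => store.get(`${kind}:${id}`))
}

const todoListQuery = {
  kinds: ['todo'],
  getResources: (todos) =>
    todos.map((todo) => ({ kind: 'todo', id: todo.id, data: todo })),
}

const store = new Map()
const lists = new Map()

// Two different requests whose results overlap on todo "2".
writeListResult(store, lists, 'todos:all', todoListQuery, [
  { id: '1', title: 'Do the dishes' },
  { id: '2', title: 'Walk the dog' },
])
writeListResult(store, lists, 'todos:search=dog', todoListQuery, [
  { id: '2', title: 'Walk the dog' },
])
```

Both lists resolve todo "2" from the same store entry, which is what would allow an update to that todo to show up in every list at once.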

What about returning different kinds of resources in list queries?

If we wanted to return multiple resource types from the server, our getResources function could look like this:

createListQuery({
  getResources: (items) =>
    items.map((item) => ({
      kind: item.kind,
      id: item.id,
      data: item,
    })),
})

What about normalizing nested resources in list queries?

To register nested resources, you could recurse on resources themselves to collect more sub resources:

createListQuery({
  getResources: (users) =>
    users.map((user) => ({
      kind: 'user',
      id: user.id,
      data: user,
      subResources: {
        todos: (todos) =>
          todos.map((todo) => ({
            kind: 'todo',
            id: todo.id,
            data: todo,
          })),
      },
    })),
})
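One way the library might consume `subResources` is to flatten the tree into a single list before writing it to the store. The sketch below assumes that each key in `subResources` names a property on the parent's data and that its function identifies the items found under that property; the RFC does not spell this out, and `flattenResources` is a hypothetical helper.

```javascript
// Assumption: each key in `subResources` names a property on the parent's
// data, and its function identifies the items found under that property.
function flattenResources(resources) {
  const flat = []

  for (const { kind, id, data, subResources = {} } of resources) {
    flat.push({ kind, id, data })

    for (const [property, identify] of Object.entries(subResources)) {
      const nested = data[property]
      if (nested === undefined) continue
      // Recurse so that sub-resources may declare their own subResources.
      flat.push(...flattenResources(identify(nested)))
    }
  }

  return flat
}

const getResources = (users) =>
  users.map((user) => ({
    kind: 'user',
    id: user.id,
    data: user,
    subResources: {
      todos: (todos) =>
        todos.map((todo) => ({ kind: 'todo', id: todo.id, data: todo })),
    },
  }))

const flat = flattenResources(
  getResources([
    { id: 'u1', name: 'Ada', todos: [{ id: 't1', title: 'Do the dishes' }] },
    { id: 'u2', name: 'Lin', todos: [] },
  ])
)
```

A fuller implementation would probably also replace the nested todos inside each user's data with references, so that the todo is not stored twice; that step is left out here.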

Resource Queries

Let's look one more time at what a resource query could look like:

const todoQuery = createQuery({
  kind: 'todo',
  getId: (variables) => variables.todoId,
  fetch: async (variables) => {
    const data = await api.get(`/todos/${variables.todoId}`)
    return data
  },
})

A resource query is used to define single resources that are fetched and synchronized from the server. For example:

  • Individual todo objects
  • Individual notification objects
  • Individual user objects
  • Individual GitHub repositories
  • Individual results that contain an array of table rows or dates to plot on a chart

Wait, did you just use array there? I thought that was for lists?

The data visualization example above is meant to illustrate that our "list" and "single" query APIs are conceptual rather than rigid classification structures. It is very common for a data visualization endpoint to return an array of table rows or dates to plot on a chart, but you would rarely want to normalize each individual row of data or each date's data. It makes more sense to treat the result as an individual resource.

Let's go over some of the basic options of resource queries:

  • kind: string - This is the kind of the individual resources that this query returns. This kind should match up with the kinds array in any list queries, so they can be aware of each other.

    createQuery({
      kind: 'todo',
      // or
      kind: 'user',
    })
  • getId: (variables) => variables.todoId - This function is how we uniquely identify our resource requests before and during fetching. You are passed the variables for the query and can return the id for what you are requesting.

  • fetch: (meta) => Promise<Data> - Similar to the query function in the current version of React Query, this function should return a promise that resolves your data from the server. In the example below, we're using the variables to pass our todoId to the server.

    createQuery({
      fetch: async (variables) => {
        const data = await api.get(`/todos/${variables.todoId}`)
        return data
      },
    })

Hold on, where is getResources for resource queries?

Since we already know the kind and the id for a resource query before we even request it, we can automatically identify the resource. The above example would result in a resource like:

const resource = {
  kind: 'todo',
  id: variables.todoId,
  data: todo,
}
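Since the identity comes from the query definition and its variables alone, it can be computed before any request is made. The sketch below shows how that could be used to look up data that an earlier query already cached. The `identifyRequest` and `getPlaceholder` helpers and the `kind:id` store keys are hypothetical, and the query object here is a plain stand-in for whatever `createQuery` would return.

```javascript
// Hypothetical helpers, not part of the RFC.
// The identity is derived from the query definition and variables alone,
// so it is available before any request is made.
function identifyRequest(query, variables) {
  return { kind: query.kind, id: query.getId(variables) }
}

// Look for data that some earlier query (for example a list) already cached.
function getPlaceholder(store, query, variables) {
  const { kind, id } = identifyRequest(query, variables)
  return store.get(`${kind}:${id}`)
}

// Plain stand-in for a resource query definition.
const todoQuery = {
  kind: 'todo',
  getId: (variables) => variables.todoId,
}

// Pretend a list query already normalized this todo into the store.
const store = new Map([['todo:1', { id: '1', title: 'Do the dishes' }]])

const placeholder = getPlaceholder(store, todoQuery, { todoId: '1' })
const missing = getPlaceholder(store, todoQuery, { todoId: '2' })
```

Here `placeholder` is the cached todo and `missing` is `undefined`, in which case the query would simply wait for its own fetch.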

New query API, new capabilities

Even without discussing mutations (yet 😉), this new query-driven API that is designed for opt-in normalization would allow React Query to gather structured information about your server dependencies and perform out-of-the-box, app-wide, automatic optimizations such as:

  • Automatic initial/placeholder data - It's just magically there.
  • Automatic updates to list queries from resource query data.
  • Better memory management by sharing resources across queries
  • One-touch manual updates to the cache, e.g. you can make a single call to update a resource by kind and id and have it reflected across all list queries and resource queries (as opposed to manually iterating over all queries and searching/updating all of the different query types like list, paginated, single, etc.)
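The "one-touch" update in the last bullet falls out of normalization almost for free. In the sketch below, lists hold references to store entries, so a single write is visible everywhere the resource appears. The `updateResource` function, the `kind:id` keys, and the list names are assumptions for illustration only.

```javascript
// Hypothetical normalized cache: one entry per resource.
const store = new Map([
  ['todo:1', { id: '1', title: 'Do the dishes', isComplete: false }],
  ['todo:2', { id: '2', title: 'Walk the dog', isComplete: false }],
])

// Two list queries that reference the same resources by key.
const lists = {
  all: ['todo:1', 'todo:2'],
  searchDishes: ['todo:1'],
}

// Single call to update a resource by kind and id.
// Returns false when the resource is not in the cache.
function updateResource(kind, id, patch) {
  const key = `${kind}:${id}`
  if (!store.has(key)) return false
  store.set(key, { ...store.get(key), ...patch })
  return true
}

// Lists are resolved by following their references into the store.
const readList = (name) => lists[name].map((key) => store.get(key))

updateResource('todo', '1', { isComplete: true })
```

After the call, both `readList('all')` and `readList('searchDishes')` return the completed todo without either list having been touched directly.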

Normalized Mutations

In React Query today, mutations are not much more than a wrapper around some tracked state variables and asynchronous lifecycle callbacks. They currently make it pretty convenient to call invalidateQueries or do optimistic updates, but the brutal fact is that we still have to do that ourselves for every single mutation.

Mutations can be so much more with normalization baked into the API:

const createTodoMutation = createMutation({
  action: 'create',
  mutate: async (newTodo) => {
    const { data } = await axios.post(`/todos`, newTodo)
    return data
  },
  getOptimisticResources: (optimisticTodo) => {
    const tempId = uuid()

    return [
      {
        kind: 'todo',
        id: tempId,
        data: { ...optimisticTodo, id: tempId },
      },
    ]
  },
  getResources: (newTodo, optimisticResources) => [
    {
      kind: 'todo',
      id: newTodo.id,
      data: newTodo,
      replaceId: optimisticResources[0].id,
    },
  ],
})

const updateTodoMutation = createMutation({
  action: 'update',
  mutate: async (todo) => {
    const { data } = await axios.put(`/todos/${todo.id}`, todo)
    return data
  },
  getOptimisticResources: (todo) => [
    {
      kind: 'todo',
      id: todo.id,
      data: todo,
    },
  ],
  getResources: (todo) => [
    {
      kind: 'todo',
      id: todo.id,
      data: todo,
    },
  ],
})

const removeTodoMutation = createMutation({
  action: 'remove',
  mutate: (todoId) => api.removeTodoById(todoId),
  getOptimisticResources: (todoId) => [{ kind: 'todo', id: todoId }],
})

Alright, so what are these action and getOptimisticResources options?

Mutations, as you saw above, are very similar in spirit to queries and use the same "resource"-esque vocabulary. However, there are a few cool options that make them very powerful:

  • action?: 'create' | 'update' | 'remove' - The action type of a mutation denotes what the mutation is doing with the resources it's handling. This action determines how optimistic updates behave, or whether to perform them at all. You're also not required to pass an action if you are simply firing off an RPC or utility call that doesn't affect any resources.
  • getOptimisticResources: (variables) => Resource[] - As you might have guessed, this function is responsible for returning optimistic information about resources. For create actions, the optimistic resources are added; for update actions, they are replaced; and for remove actions, they are removed.

Along with mutations and optimistic updates, list queries could also have options like:

  • optimistic: boolean | ['create', 'update', 'remove'] - Whether the list query should respond to none, all or some optimistic update actions
  • createMode: 'append' | 'prepend' - A quick way to determine whether new resources should be pushed or unshifted onto list queries
  • optimisticCreate/optimisticUpdate/optimisticRemove: (existingResources, optimisticResources) => newResources - Functions to manually override how optimistic actions and their resources are handled. Depending on the action, you could manually append, prepend, replace, or remove resources.
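To illustrate how a list query might react to each action by default, here is a minimal sketch of the add/replace/remove behaviour described above, including `createMode`. The `applyOptimistic` function is hypothetical; it has the same general shape as the proposed `optimisticCreate`/`optimisticUpdate`/`optimisticRemove` overrides but is not taken from the RFC.

```javascript
// Hypothetical default handling of optimistic actions for one list query.
// Always returns a new array and leaves `existing` untouched.
function applyOptimistic(existing, action, optimistic, { createMode = 'append' } = {}) {
  const sameResource = (a, b) => a.kind === b.kind && a.id === b.id

  switch (action) {
    case 'create':
      // Pushed or unshifted depending on createMode.
      return createMode === 'prepend'
        ? [...optimistic, ...existing]
        : [...existing, ...optimistic]
    case 'update':
      // Replace any resource that has an optimistic counterpart.
      return existing.map(
        (resource) => optimistic.find((o) => sameResource(o, resource)) || resource
      )
    case 'remove':
      return existing.filter(
        (resource) => !optimistic.some((o) => sameResource(o, resource))
      )
    default:
      // No action: the mutation does not affect any resources.
      return existing
  }
}

const existing = [
  { kind: 'todo', id: '1', data: { id: '1', title: 'Do the dishes' } },
  { kind: 'todo', id: '2', data: { id: '2', title: 'Walk the dog' } },
]

const created = applyOptimistic(
  existing,
  'create',
  [{ kind: 'todo', id: 'temp', data: { id: 'temp', title: 'Buy milk' } }],
  { createMode: 'prepend' }
)
const updated = applyOptimistic(existing, 'update', [
  { kind: 'todo', id: '2', data: { id: '2', title: 'Walk the cat' } },
])
const removed = applyOptimistic(existing, 'remove', [{ kind: 'todo', id: '1' }])
```

Returning a new array instead of mutating the existing one keeps the previous state around, which is convenient if the optimistic change later has to be undone.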

New mutation API, even more new capabilities

With our new structured information about mutations and their dependencies we can make even more incredible optimizations like:

  • Automatic optimistic updates across all list queries and resource queries
  • Automatic rollbacks for optimistic updates
  • Auto-invalidations across all list queries and resource queries after successful mutations
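The rollback and `replaceId` behaviour for a create mutation could work roughly as follows: apply the optimistic resources, remember what was added, then either swap the temporary entries for the server's versions or remove them again. `beginOptimisticCreate` and its `commit`/`rollback` methods are hypothetical names, and the asynchronous `mutate` call is left out so that the sketch stays synchronous.

```javascript
// Hypothetical sketch, not part of the RFC: apply optimistic resources,
// then commit the server result or roll back.
function beginOptimisticCreate(store, optimisticResources) {
  // Remember which keys were added so that a failure can undo exactly those.
  const addedKeys = optimisticResources.map(({ kind, id }) => `${kind}:${id}`)
  optimisticResources.forEach((resource, i) => store.set(addedKeys[i], resource.data))

  return {
    // On success, swap each temporary entry for the server's version.
    commit(serverResources) {
      for (const { kind, id, data, replaceId } of serverResources) {
        if (replaceId !== undefined) store.delete(`${kind}:${replaceId}`)
        store.set(`${kind}:${id}`, data)
      }
    },
    // On failure, remove everything the optimistic step added.
    rollback() {
      addedKeys.forEach((key) => store.delete(key))
    },
  }
}

const store = new Map()

// Successful mutation: the temporary id is replaced by the server's id.
const succeeded = beginOptimisticCreate(store, [
  { kind: 'todo', id: 'temp-1', data: { id: 'temp-1', title: 'Buy milk' } },
])
succeeded.commit([
  { kind: 'todo', id: '42', data: { id: '42', title: 'Buy milk' }, replaceId: 'temp-1' },
])

// Failed mutation: the optimistic entry is removed again.
const failed = beginOptimisticCreate(store, [
  { kind: 'todo', id: 'temp-2', data: { id: 'temp-2', title: 'Call mom' } },
])
failed.rollback()
```

Update and remove actions would need a snapshot of the previous data rather than a list of added keys, but the commit-or-rollback shape would stay the same.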

Okay, but what if I don't need normalization? React Query was great at being "simple"...

You're right! That's why this new API is designed to be an easy opt-in for normalization, but definitely not a requirement. Take the following query for example:

const todoQuery = createQuery({
  fetch: async () => {
    const { data } = await axios.get(
      'https://api.github.com/repos/tannerlinsley/react-query'
    )
    return data
  },
})

Because queries are defined in the module scope rather than in hooks, the query instance itself and/or the identity of its fetch function can be used to uniquely identify the query and perform deduping, etc. Which brings us to another great question...

How are queries and requests uniquely identified?

Queries could be uniquely identified by:

  • Query instance
  • fetch function
  • kinds or kind

Individual requests for queries could be uniquely identified further by:

  • Stringified variables
  • Derived resource ID from variables
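As one possible reading of the two lists above, a request key could combine the identity of the query instance with either the derived resource id or the stringified variables. The `getRequestKey` and `stableStringify` helpers below are hypothetical, and the query objects are plain stand-ins for whatever `createQuery` and `createListQuery` would return.

```javascript
// Hypothetical sketch, not part of the RFC.
// Serialize variables with sorted keys so that property order does not matter.
function stableStringify(value) {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .filter((key) => value[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`)
    return `{${entries.join(',')}}`
  }
  return JSON.stringify(value)
}

// Give each query instance a stable numeric identity without mutating it.
const queryIds = new WeakMap()
let nextQueryId = 0

function getRequestKey(query, variables = {}) {
  if (!queryIds.has(query)) queryIds.set(query, nextQueryId++)
  const queryId = queryIds.get(query)

  // Resource queries can use the derived id; other queries fall back to
  // their stringified variables.
  const requestPart = query.getId
    ? `id:${String(query.getId(variables))}`
    : `vars:${stableStringify(variables)}`

  return `${queryId}|${requestPart}`
}

// Plain stand-ins for query definitions.
const todoListQuery = { kinds: ['todo'] }
const todoQuery = { kind: 'todo', getId: (variables) => variables.todoId }

const a = getRequestKey(todoListQuery, { page: 1, search: 'dog' })
const b = getRequestKey(todoListQuery, { search: 'dog', page: 1 })
const c = getRequestKey(todoListQuery, { page: 2, search: 'dog' })
const d = getRequestKey(todoQuery, { todoId: '1' })
```

Keys `a` and `b` are equal because the variables differ only in property order, so those two requests could be deduped, while `c` identifies a different page of the same query.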

How could we consume these queries?

With a new and improved useQuery hook!

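function Todos() {
  // Subscribe to the list query defined earlier in module scope
  const todosResult = useQuery({
    query: todoListQuery,
  })
}

function Todo({ todoId }) {
  // Subscribe to a single todo, identified through its variables
  const todoResult = useQuery({
    query: todoQuery,
    variables: { todoId },
  })
}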

What happens to the rest of the awesome stuff I'm used to in React Query v3?

It stays! Yes, I'm talking about keeping 99% of what you know and love today in React Query, including (but not limited to):

  • stale/cache timings
  • request deduping
  • polling
  • dependent queries
  • parallel queries
  • pagination/lagged queries

So what's really changing?

  • No longer defining queries on the fly in useQuery(), but instead defining queries ahead of time which can power our subscriptions via hooks like useQuery()
  • No more unstructured query keys. Instead, kind, id and variables would be used to uniquely identify a query
  • No longer required to implement your own:
    • Initial Data / Placeholder Data
    • Optimistic Updates / Rollbacks
    • Mutation-related query invalidations

WIP Proof of concept and inspiration

https://codesandbox.io/s/focused-moser-gu66b?file=/src/App.js

@Stancobridge
Won't break v3 code when upgraded ??