Extreme Validation

I’ve observed an interesting trend in some new companies over the past few years. Companies like IFTTT (consolidates small pieces of functionality exposed via tons of different APIs), Buffer (aggregates the creation functions available on social media APIs), Coverhound (aggregates searchable insurance platforms), Zapier, etc. all integrate tons of third-party APIs to provide a consolidated platform. I happen to currently work on one of these integration projects. This type of integration development comes with an interesting little problem: validation.

We integrate with many APIs that provide similar, but not exactly the same, data, and we need to validate user input against not only our business rules but also, optionally, many third-party rules. One of the biggest downsides to some integrations is that third parties will often only sync data once daily rather than immediately when data changes. The problem is that if the user changes some data in a way that violates the business rules of a partner that syncs once per day, by default there will be a long delay between the change and the resulting validation failure.

One of the most common things a developer does is validate user input. Whether it’s an email, photo, or file, you’ve likely got some special validations around size, dimensions, or format. Let’s say that your business requires that the User address or lat/lng is present and in a valid format. If you’re using Rails, there are simple built-in ActiveRecord validations, or you could write a custom one-off method to validate this rule.
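For that baseline rule, a built-in validation plus a one-off custom method might look roughly like this (just a sketch; the attribute names are assumed):

# A sketch of the baseline rule using built-in ActiveRecord validations
# plus a one-off custom method (attribute names are assumed).
class User < ActiveRecord::Base
  validates :lat, numericality: { greater_than_or_equal_to: -90, less_than_or_equal_to: 90 }, allow_nil: true
  validates :lng, numericality: { greater_than_or_equal_to: -180, less_than_or_equal_to: 180 }, allow_nil: true
  validate :address_or_coordinates_present

  private

  # Business rule: either a street address or a lat/lng pair must be supplied.
  def address_or_coordinates_present
    if address.blank? && (lat.blank? || lng.blank?)
      errors.add(:base, 'provide an address or a lat/lng pair')
    end
  end
end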

When integrating and syncing with multiple different APIs that all have requirements specific to their business, the problem gets much more interesting. Let’s spice up our example and say that we want to sync our users’ data to partner A, GCorp, and/or partner B, LBoss.

GCorp requires the address is present and follows RFC 5774.

LBoss requires that the lat/lng is present and falls within the standard decimal ranges (-90 to 90 for latitude, -180 to 180 for longitude).

We anticipate (and would love) integrating with more partners in the future, and not all users will set up syndication with both partners. So in some cases we want to be strict about requiring certain information from customers, and in other cases we’re more flexible about what we will accept.

After several iterations this is roughly the model that I came up with to solve this problem:

First: The base validation system is built out of `Validator` objects that consist of a set of `Validation`s. Each `Validation` is a callable that returns either nothing, a `ValidationError`, or an array of `ValidationError`s.
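To make that concrete, here’s a minimal sketch of what those objects might look like; the class shapes follow the description above, but the LBoss lat/lng rule and the attribute names are assumptions for illustration, not the production code:

# A minimal sketch of the validation objects described above. The LBoss
# lat/lng rule and attribute names are assumed for illustration.
class ValidationError
  attr_reader :field, :message

  def initialize(field, message)
    @field = field
    @message = message
  end
end

# A Validation is just a callable that returns nothing, a ValidationError,
# or an array of ValidationErrors.
LBOSS_LAT_LNG = lambda do |user|
  errors = []
  errors << ValidationError.new(:lat, 'must be present and between -90 and 90') unless user.lat && (-90..90).cover?(user.lat)
  errors << ValidationError.new(:lng, 'must be present and between -180 and 180') unless user.lng && (-180..180).cover?(user.lng)
  errors
end

# A Validator runs a set of Validations and collects any errors they return.
class Validator
  def initialize(validations)
    @validations = validations
  end

  def call(record)
    @validations.flat_map { |validation| Array(validation.call(record)) }.compact
  end
end

LBOSS_USER_VALIDATOR = Validator.new([LBOSS_LAT_LNG])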

Second: Each user account has a collection of `Notification`s. For the purposes of our exercise, these will be used to display to the user a list of all the issues with their data.

Third: Signals. For each third-party integration we register a set of signal handlers that fire when the important models change. In our case, when the User model is saved and the address changes, that fires a signal handler we have registered to run the User-specific validators for GCorp. We have a separate signal handler for the LBoss validators.

If any of the validators fail in the signal handlers, a `Notification` is created so that we can flash the user with third-party-specific validation information.

The flow is something like this:

User updates data for model X -> POST to our server -> update model X -> signal handlers for each partner run for model X -> if any validations fail, `Notification`s are created -> response is the usual 200 OK. Subsequent requests for the user’s account will include the associated notifications for all failed validations (we have a separate mechanism for busting this cache).

Validation: encapsulates the logic for a business rule
Validator: encapsulates the logic for running Validations and collecting their results
ValidationError: encapsulates data about a failed validation
Notification: created when a validation fails
Signal/Callback: a convenient mechanism to run validations in a decoupled way, allowing different rules to run depending on which third parties the user has integrated with

The biggest takeaway: if you’re validating business rules for your own business in addition to those of integrated third parties, one approach is to split the third-party validations out into their own module and run them in a post-save / post-delete phase via a signal, trigger, or callback. (Signals are great for this in Django; in Rails I would consider using some of the ActiveRecord callbacks.)
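A minimal sketch of that Rails-callback approach, building on the validator sketch above; the `Notification` attributes, the `syncs_with_*?` predicates, and the partner validator constants are assumptions for illustration, not production code:

# A rough Rails analogue of the signal-handler flow using an ActiveRecord
# callback. Notification attributes, the syncs_with_*? predicates, and the
# partner validator constants are assumed names for illustration.
class User < ActiveRecord::Base
  has_many :notifications

  # In practice you'd probably also guard on which attributes actually
  # changed before running the partner validations.
  after_save :run_partner_validations

  private

  def run_partner_validations
    partner_validators.each do |partner, validator|
      validator.call(self).each do |error|
        notifications.create!(partner: partner, field: error.field, message: error.message)
      end
    end
  end

  # Only run the validators for the partners this account actually syncs with.
  def partner_validators
    validators = {}
    validators['GCorp'] = GCORP_USER_VALIDATOR if syncs_with_gcorp?
    validators['LBoss'] = LBOSS_USER_VALIDATOR if syncs_with_lboss?
    validators
  end
end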


Rails + Sitemap + Heroku + AWS

tl;dr Generate the sitemap files, push them to AWS and set up a route that redirects to those files from Rails.

While exploring Google Webmaster Tools and inspecting some aspects of Insider AI’s SEO, I recognized a missing piece of the puzzle: a sitemap! There are a few options out there for generating sitemaps for Rails, most of which generate a set of XML files and drop them in your public directory. This won’t work for Insider AI, as it has dynamic blog content that I want mapped so that it’s indexed by search engines. If you’ve worked much with Heroku, you know that it’s not a static file server. In fact, if you generate or attempt to store uploaded files on Heroku, they’ll get stomped out :(.

Goal: Generate dynamic sitemaps.
Problem: Heroku doesn’t play nice with generated static files.
Solution: Upload generated sitemaps to AWS.

The gem I landed on is called sitemap_generator. The wiki on its GitHub page has some examples for getting up and running with Fog and CarrierWave.

These solutions were a bit heavyweight for me, so I ended up modifying that code to arrive at a nice solution for generating sitemaps and uploading them to AWS.

Here’s everything you need to know:

1. Sign up for AWS
2. Create an IAM User (note the KEY_ID and ACCESS_KEY)
3. Create a bucket on S3 (note the bucket name as BUCKET)
4. Add a policy to the bucket to allow uploading (AWS has a policy generator, or you can use the overly promiscuous one below)

{
	"Version": "2012-10-17",
	"Id": "Policy1",
	"Statement": [
		{
			"Sid": "Stmt1",
			"Effect": "Allow",
			"Principal": {
				"AWS": "*"
			},
			"Action": "s3:*",
			"Resource": "arn:aws:s3:::YOUR_AWS_BUCKET_NAME/*"
		}
	]
}

5. Add these gems to the Gemfile (I use figaro for key management)

# Gemfile
gem 'aws-sdk', '< 2.0'
gem 'figaro'
gem 'sitemap_generator'

6. Install figaro (it creates config/application.yml and git-ignores it; safety first!)

figaro install

7. Make the keys and bucket name available to the environment in config/application.yml

AWS_ACCESS_KEY_ID: KEY_ID
AWS_SECRET_ACCESS_KEY: ACCESS_KEY
AWS_BUCKET: BUCKET

8. Create config/sitemap.rb to define what gets mapped

# config/sitemap.rb
SitemapGenerator::Sitemap.default_host = "https://insiderai.com"
SitemapGenerator::Sitemap.create_index = true
SitemapGenerator::Sitemap.public_path = 'public/sitemaps/'
SitemapGenerator::Sitemap.create do
  add '/welcome'
  add '/blog'
  add '/about'
  Post.find_each do |post|
    add post_path(post), lastmod: post.updated_at
  end
end

9. Create lib/tasks/sitemap.rake to define the rake task for uploading the sitemap files to S3

require 'aws'
namespace :sitemap do
  desc 'Upload the sitemap files to S3'
  task upload_to_s3: :environment do
    s3 = AWS::S3.new(
      access_key_id: ENV['AWS_ACCESS_KEY_ID'],
      secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
    )
    bucket = s3.buckets[ENV['AWS_BUCKET']]
    Dir.entries(File.join(Rails.root, "public", "sitemaps")).each do |file_name|
      next if ['.', '..'].include? file_name
      path = "sitemaps/#{file_name}"
      file = File.join(Rails.root, "public", "sitemaps", file_name)

      # Upload each generated sitemap file to the bucket under sitemaps/.
      object = bucket.objects[path]
      object.write(file: file)
      puts "Saved #{file_name} to S3"
    end
  end
end
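If you want a single command to regenerate and upload (handy for something like Heroku Scheduler), a small wrapper task can chain sitemap_generator’s own `sitemap:refresh:no_ping` task with the upload task above. Just a sketch:

# lib/tasks/sitemap.rake (a small optional addition)
namespace :sitemap do
  desc 'Regenerate the sitemap files and push them to S3'
  task refresh_and_upload: :environment do
    # sitemap_generator provides sitemap:refresh:no_ping, which rebuilds the
    # files defined in config/sitemap.rb without pinging search engines.
    Rake::Task['sitemap:refresh:no_ping'].invoke
    Rake::Task['sitemap:upload_to_s3'].invoke
  end
end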

10. Redirect requests for your sitemap to the files stored on AWS. (Needs improvement, but works.)

# config/routes.rb
get "sitemap.xml.gz" => "sitemaps#sitemap", format: :xml, as: :sitemap

# app/controllers/sitemaps_controller.rb
class SitemapsController < ApplicationController
  def sitemap
    redirect_to "https://s3.amazonaws.com/#{ ENV['AWS_BUCKET'] }/sitemaps/sitemap.xml.gz"
  end
end

Hope this helps! Let me know if you get stuck somewhere and I’ll do my best to help you out 🙂


Rails edge case solved with middleware

Recently I was working with a friend, @sidho, on an interesting problem. Sid built this awesome app for tracking beers. It’s a Rails app that pulls some data from a brewery API. He set up a webhook and was receiving updates from the API. The problem was that the data posted to the webhook included the key “action”, which was used to denote what type of change was happening. By default, when a request is routed, Rails sets the key “action” to the controller action name and the key “controller” to the name of the controller. I spent a little time searching, and looked at the source for about 20 minutes, before deciding the best solution would be to somehow intercept the params, rename the key “action” to something else, and then let Rails do its thing.

Here’s our first ever Rack middleware; its only job is to rename an incoming param with the name “action” to “beer_db_action”.

# lib/params_fixer.rb
class ParamsFixer
  def initialize(app)
    @app = app
  end

  def call(env)
    request = Rack::Request.new(env)
    if request.params['action']
      request.update_param('beer_db_action', request.params['action'])
    end
    status, headers, resp = @app.call(env)
    [status, headers, resp]
  end
end

# config/application.rb
config.autoload_paths += Dir["#{config.root}/lib/**/"]
config.middleware.use "ParamsFixer"
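With the middleware in place, the webhook controller can read the renamed key. Here’s a hypothetical sketch (the controller name and action values are illustrative, not taken from the brewery API’s docs):

# app/controllers/webhooks_controller.rb (hypothetical example)
class WebhooksController < ApplicationController
  # Webhook POSTs aren't browser forms, so skip CSRF protection here.
  skip_before_action :verify_authenticity_token

  # The brewery API still posts `action`, but by the time routing finishes,
  # the original value is available under `beer_db_action`.
  def create
    case params[:beer_db_action]
    when 'insert', 'edit'
      # create or update the local record from the webhook payload
    when 'delete'
      # remove the local record
    end
    head :ok
  end
end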

Check out our solution if you’re interested 🙂

I’m hoping to write a pull request for the gem, or make a Rails version of it.


Solving Presence in Rails: Pusher vs. Node Service

There you are, sipping a mocha, writing another Rails app. It has users, and you’d love it if those users could interact and have some deep, meaningful realtime connection. What do you reach for? Pusher? Generally that’s my first go-to for realtime stuff! Pusher is awesome. It’s a software-as-a-service platform for adding realtime awesomeness to your app. As you would expect, there are libraries in every flavor so you can use Pusher from whatever crazy setup you’ve got.

In the past, I’ve dropped the pusher gem into Rails apps and Wham! now I’m pushin realtime updates to my users from the server. If you’re curious how easy it is to get started with Pusher + Rails this is it:

If you want to follow along at home, you can check out the 736e3b861705 branch of github.com/w1zeman1p/code_racer.

Add the `pusher_rails` gem to the Gemfile, then copy and paste the initializer code they give you when you create an app on their site.

# Gemfile
gem 'pusher_rails'

# config/initializers/pusher.rb
require 'pusher'
Pusher.url = "http://#{ some key they give you }@api.pusherapp.com/apps/#{ some app id they give you }"
Pusher.logger = Rails.logger

Initialize an instance of a `Pusher` object in JavaScript somewhere (this is also how you might set up logging).

// app/assets/javascripts/application.js
Pusher.log = function (message) {
  if (window.console && window.console.log) {
    window.console.log(message);
  }
};
var pusher = new Pusher(some key they give you);

That’s it. You’re now up and running and can subscribe to events that are being pushed to the client.

Wow, that was easy. What’s the catch? Where does it fall down? Great question! (Disclaimer: you can get more out of Pusher if you pay $$; I’m interested in squeezing as much out of the free service as possible.) This is great if the client is just listening for updates, but as soon as you need clients to emit events to each other, or emit events back to the server, Pusher wants you to pay. (Thanks @Phil Leggetter! The only limitation on the free account is the number of connections.) I completely understand; it seems like a valid business model. That said, it’s surprisingly easy to get more realtime mileage if you extract that logic into a service.

Extract into a service you say…

To check out the Node service that I extracted, take a gander at https://github.com/w1zeman1p/code_racer_rt (the juicy stuff is in lib/rooms.js).

Node.js is a great platform for realtime communication and has many tools surrounding it [see socket.io and peer.js]. IMO, it’s got a beautiful evented architecture and great tools for managing tiny requests and streams efficiently.

Following the docs on socket.io was a great start to getting things running locally. Moving the client-side logic into the JavaScript served from my Rails app and replacing all the client-side references to localhost with my Heroku domain running the Node app was enough to get going.

So what did I have to change in Rails? I was able to strip out every single reference to Pusher, including the gem and initializer :), and then add a few lines requiring the socket.io-client library.

For me, the trick to getting everything to play nicely was setting up the socket.io connection from the client, then, as soon as the first Rails page loaded, emitting a `register` event and storing a hash of users by socket id in Node. Essentially syncing the sessions.

One beauty of using socket.io is that you get presence (who’s online) just by storing this hash of users.

How did I arrive at this solution? Why move away from Pusher and to Node? What was the smell, the thing to look for, that pushed me to make this huge change? Another great question! My presence implementation started to feel, um, hella hacky. Let’s look at some code:

At some point I decided that users should be able to see who else is online (presence). I thought of a few ways to accomplish this, the first of which was to emit a `hello` event to all other people in the channel when the page loads. (I think you can do this with paid Pusher; I’m essentially building a toy, so that wasn’t an option.)

Okay, option 2: I’ll send an XHR request when the page loads and POST to a Rails endpoint (I called it /api/online_users). This was a pretty cool, but fragile, solution. Here are a couple of commits with most of the code: w1zeman1p/code_racer/commit/f5e6abee69 and w1zeman1p/code_racer/commit/c84e731e0423

The gist is that on document ready we send a POST, and on beforeunload we send a DELETE. Then all clients bind to a channel called `presence`, and when an OnlineUser resource is created, we trigger that event and notify all users.

// on document ready
$.ajax({
  url: '/api/online_user',
  type: 'POST',
  data: window.CURRENT_RACER
});

// cleanup stuff
function cleanup() {
  CodeRacer.pusher.disconnect();
  $.ajax({
    url: '/api/online_user',
    type: 'DELETE',
  });
}
$(window).on('beforeunload', function () {
  var x = cleanup();
  return x;
});

// bind all users to presence channel, and listen for add_user
CodeRacer.pusher = new Pusher(key);
CodeRacer.presence = CodeRacer.pusher.subscribe('presence');
CodeRacer.presence.bind('add_user', function (data) {
  console.log('User coming online:', data);
});

From the Rails side, one option is to store all of these online users in the SQL database. I didn’t go down that path for fear that talking to the SQL db would be too slow (I didn’t do any perf testing here; it might be worth a try).

I tried using the Rails cache; in production I used MemCachier. This worked pretty well, until some users’ `beforeunload` DELETE never fired and they ended up sticking around. More code?

# app/controllers/api/online_users_controller.rb
before_action :get_users
after_action :set_users

def create
  @users << user_hash unless @users.include?(user_hash)
  Pusher['presence'].trigger('add_user', user_hash)
  render json: @users
end

def user_hash
  {
    id: current_user.id,
    nickname: current_user.nickname
  }
end

def get_users
  @users ||= Rails.cache.read('users') || Set.new
end

def set_users
  Rails.cache.write('users', @users.to_a)
end
# ...

Kinda hacky? Yeah, I thought so too. It all depends on the `beforeunload` event firing just right and actually completing the DELETE request perfectly to remove the user from the cache. I suppose I could poll… ew. gross. No thanks.

Option 3! Replace the online user resource completely with a node service. This was the winner. No Rails controller (talk about skinny ;)), No Rails cache, No $.ajax requests, all socket.io.

Here’s a jumpstart for getting some Node code running socket.io and doing presence with `register` and `online_users` events. The idea is that we’ll emit a `register` event from the client when the page loads, and listen for an `online_users` event for batch updates about who’s online (this could probably be more efficient if we listened for add and remove events rather than batch updating).

// Node application running in a separate instance from the Rails server.
// app.js
var http = require('http'),
  static = require('node-static'),
  file = new static.Server('./public'),
  _ = require('lodash');

var server = http.createServer(function (req, res) {
  req.addListener('end', function () {
    file.serve(req, res);
  }).resume();
});

// process.env.PORT is for heroku 🙂
server.listen(process.env.PORT || 8000);
var io = require('socket.io')(server);
var users = {};

io.on('connection', function (socket) {
  socket.on('register', function (data) {
    users[socket.id] = data;
    io.sockets.emit('online_users', _.values(users));
  });

  socket.on('disconnect', function () {
    delete users[socket.id];
    io.sockets.emit('online_users', _.values(users));
  });
});

Now that we’ve fired up a Node app and we’re listening for connections, let’s see the code we’ll need from the client.

// app/assets/javascripts/application.js
var socket = io('http://mynodeapp.herokuapp.com');
socket.on('online_users', function (data) {
  console.log('online users: ', data);
});
// on document ready
socket.emit('register', window.CURRENT_RACER);

Where can we go from here? What incredible powers does this give us? Peer to peer! A feature I’d love to add is voice/video of the racers, so you can see that look of focus and determination as they type as fast as possible :). Seems like a pretty reasonable feature to add with peer.js using WebRTC.


App Landing Page for Ionic app

We’ve all been to sites that explain the features of an app and entice you to visit the marketplace and download it. If you’ve developed an app and are starting to build out a landing page, you might have googled “app landing page” and found some themes from themeforest.net. These are flashy designs that display the features of the app and expect the creator to drop in app screenshots that may slide around or transition as you move between features.

Something great about apps built with the Ionic framework is that you can demo the nearly full app (minus device features) right on your marketing page (not sure this is cool, legally). I’ve done that for a couple of Ionic apps, Pushbit and Insider AI, and want to show you how.

These two pages are both being served from Rails apps, but you shouldn’t have any difficulty getting things working from your own server.

The key here is to run the Ionic app in an iframe.

In Rails, there is a /public directory that contains static HTML. In /public I’ll create a directory called app and copy the contents of the Ionic app’s www directory into it.
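If you’d rather script that copy step, a tiny rake task along these lines works; the Ionic project path here is an assumption:

# lib/tasks/ionic.rake (just a sketch; adjust the source path to your project)
require 'fileutils'

namespace :ionic do
  desc 'Copy the built Ionic www/ assets into public/app'
  task :copy_app do
    source = Rails.root.join('..', 'my_ionic_app', 'www') # assumed location of the Ionic project
    target = Rails.root.join('public', 'app')
    FileUtils.mkdir_p(target)
    FileUtils.cp_r(Dir.glob("#{source}/*"), target)
  end
end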

In the HTML for the page, I’ll put an image of the device as the background, then an iframe pointing to the app’s index.html.

<div id="phone">
  <img src="/assets/iphone6.png">
  <iframe src="/app/index.html" frameBorder="0" class="screens"></iframe>
</div>