The trends are clear. The last couple of years have seen an unprecedented interest in big data analysis.
The reasons for this are simple. With ever-expanding capacities and (Thai floods notwithstanding) plummeting costs for hard drives, the amount of data that can be cheaply stored has exploded. Combined with cheap automated collection methods, this means many fields have seen orders-of-magnitude increases in data volume that make a mockery of any hyperbole you can throw at them.
This increase in data volumes can improve the power of your statistical tests, and even allow new types of analyses that weren’t previously possible. It’s a wonderful thing for statisticians and data scientists. The downside is that to deal with big datasets you often need to learn a whole new set of skills. Time spent worrying about hardware, or trying to be a network administrator, is time you could have spent analysing data. At Live Analytics we believe that you should be allowed to get on with data analysis, without worrying about these other distractions. As well as avoiding the hassle of software and package installation, Live-R has a few tricks up its sleeve to help you with big analyses.
Firstly, the architecture that Live-R is built upon is highly scalable. If you need more memory, Live-R will allocate more memory to you. You no longer need to go to the shops and buy another stick of RAM.
However, by far the most useful feature of Live-R for big analyses is the ability to run tasks in the background, letting you get on with other work while your job executes or, even better, go and have a cup of tea. Here’s how it works.
First, write a script to implement your analysis. I suggest testing it on a small dataset to make sure it works correctly before you run the full thing. In this case, we are just generating a large matrix and calculating its QR factorisation. Note that the script needs to include a way of saving the outputs: either printing to screen, or writing to file (in this case we call save to store the variable). Now click File -> “Execute script in background”.
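A script of the kind described above might look roughly like the following. This is a minimal sketch in base R, not the exact script used here: the matrix size, the variable names x and qr_x, and the file name qr_result.RData are all assumptions chosen for illustration.

```r
# Illustrative background script (names and sizes are assumptions).
set.seed(42)                 # make the random matrix reproducible

n <- 1000                    # start small to test; increase for the full run
x <- matrix(rnorm(n * n), nrow = n, ncol = n)

qr_x <- qr(x)                # QR factorisation using base R's qr()

# Anything printed ends up in the console output of the job.
print(qr_x$rank)

# Save the result so it is still available after the job has finished.
save(qr_x, file = "qr_result.RData")
```

Testing with a small n first, as suggested, costs only a moment and catches typos before the full-sized job is submitted.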
Next, choose your script file, either from your cloud-hosted files within Live-R, or from your own machine.
Live-R gives you a message to say that it is running, and where to find the output. Click OK and have that cup of tea.
When it’s done, two files showing the console output and error output are created, along with the variables we saved. It’s as easy as that.
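Once the job has finished, the saved variables can be read back into a fresh session with load. The sketch below assumes the result was saved as an object called qr_x in a file called qr_result.RData (both hypothetical names); its first few lines simply stand in for the background job so that the example runs on its own.

```r
# Stand-in for the background job, so this example is self-contained.
# The object and file names are assumptions for illustration.
x <- matrix(rnorm(100 * 100), nrow = 100)
qr_x <- qr(x)
save(qr_x, file = "qr_result.RData")
rm(x, qr_x)

# Restore the saved variables; load() returns the names of the objects it restored.
loaded <- load("qr_result.RData")
print(loaded)

# Carry on with the analysis, e.g. extract the upper-triangular factor R.
r <- qr.R(qr_x)
print(dim(r))
```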