OT: Any R programmers here?


MattTuck
05-01-2017, 11:47 AM
May need to learn this language, curious about experiences and advice.

thegunner
05-01-2017, 11:53 AM
i've dabbled, but for most numerical analysis stuff i've always fallen back on python with numpy.

it's easier to plug in to other things.

anecdotal at best^^ haha

abr5
05-01-2017, 12:35 PM
Haven't used R, normally just Octave (MATLAB, but free!). But on a related note, if you are learning R you will probably enjoy this. Using statistics (and R) to win petty arguments!

http://webcache.googleusercontent.com/search?q=cache%3Atylermw.com%2Fsoma-water-filters-are-worthless-how-i-used-r-to-win-an-argument-with-my-wife%2F&oq=cache%3Atylermw.com%2Fsoma-water-filters-are-worthless-how-i-used-r-to-win-an-argument-with-my-wife%2F&aqs=chrome..69i57j69i58.1573j0j4&sourceid=chrome&ie=UTF-8

batman1425
05-01-2017, 12:45 PM
I can run commands that have been given to me by better users and can tweak subtle parameters for visual outputs of data. Most of the people I know who are in similar situations say that if you have no programming experience, it isn't that hard to learn. It can be more challenging if you are already well versed in something else, as the syntax doesn't translate exactly.

For getting started with the basics, I found this resource from Penn State very helpful:

https://onlinecourses.science.psu.edu/stat484/node/203

It's free content and the videos are good. There are even practice exercises that you can do to help reinforce the foundational stuff. You may need more advanced help for the analysis tools that are specific to your application, but there is good information out there in the R code open access databases.

neusmell
05-01-2017, 01:37 PM
I feel like it depends on if you need to really *learn* how to program R or how to use R to do basic data analysis. For most people it's really just the latter, which isn't so bad.

If you need to learn how to actually program R - like to the point that you could write your own library - then I find it harder than a general programming language like Python.

The key mental trick (for me at least) is remembering that in R you tell the computer what you want done, not how to do it.
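A rough sketch of that "what, not how" idea. It's written in Python only because it came up earlier in the thread; in R the whole-vector versions would be mean(x) and x[x > mean(x)]. The numbers are made up.

```python
from statistics import mean

# Hypothetical sample data, invented for illustration only.
heights_cm = [172.0, 165.5, 180.2, 158.9, 175.4]

# "How" style: spell out every step of the loop yourself.
total = 0.0
for h in heights_cm:
    total += h
mean_loop = total / len(heights_cm)

# "What" style: name the result you want and let the library do the looping.
mean_whole = mean(heights_cm)

# Same idea for filtering: describe the values you want to keep.
tall = [h for h in heights_cm if h > mean_whole]

print(mean_loop, mean_whole, tall)
```

Both styles give the same mean; the second just reads closer to the question being asked.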

SlackMan
05-01-2017, 02:07 PM
I've used it a bit. An important question is what programming language experience do you already have? It's not the most user friendly in terms of error messages, but it's relatively straightforward once you pick up the basics.

MattTuck
05-01-2017, 02:29 PM
I have some basic programming experience, but more in the fundamentals of programming (logic, if/then, for, etc.; I took courses in C and Java a long time ago) than in actual software development.

Yes, I don't think I'd have to do actual programming, just do data analysis.

Thanks for the Penn State link. Will check it out.

Louis
05-01-2017, 02:38 PM
I've never heard of it, but if you have some experience programming and R is sort of like Matlab it shouldn't be too painful to pick up.

Aside: The Matlab "Help" function is great, and the debugger also useful. (My recommendation to anyone who needs some basic computer programs to crunch data or even do basic calculations is always to go with Matlab, if it's available.) Only if you need to crunch truly massive amounts of data does it have issues.

kgreene10
05-01-2017, 02:56 PM
I have used R for social science statistical analysis fairly extensively. I don't know any other programming languages and picked up R on an as-needed basis. To me, it's a royal PITA but has some functionality I need that other statistical analysis packages do not. Let me know what you need.

nooneline
05-01-2017, 03:06 PM
I did in grad school, but I've forgotten everything I learned... except that there was a module that set up a menu-driven GUI for the commands, so I used that instead of programming stuff myself. That was handy.

Louis
05-01-2017, 04:02 PM
Some fun with MATLAB:

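x = 0:0.01:1;
figure(1)
filename = 'testnew51.gif';
for n = 1:20
    % Plot x^n and grab the rendered figure as an indexed image
    y = x.^n;
    plot(x,y)
    drawnow
    frame = getframe(1);
    im = frame2im(frame);
    [imind,cm] = rgb2ind(im,256);
    if n == 1
        % First frame creates the GIF and makes it loop forever
        imwrite(imind,cm,filename,'gif','Loopcount',inf);
    else
        % Later frames are appended to the same file
        imwrite(imind,cm,filename,'gif','WriteMode','append');
    end
end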

rkhatibi
05-01-2017, 04:13 PM
No personal experience, but I have coworkers who do. They recommend Python with its assorted data and math libraries as still the easiest way to get started. Most system/ops engineers I work with want to avoid learning a domain-specific language in favor of learning to do more with a general-purpose language. Your priorities may vary.

I can't evaluate the information on R, but this article is spot on based on my experience with Python:
https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis#gs.7IY3UkY

fiamme red
05-01-2017, 04:27 PM
A friend of mine learned R with the Johns Hopkins class on Coursera, and he highly recommends it: https://www.coursera.org/learn/r-programming.

alterergo
05-01-2017, 05:14 PM
I had to learn R when I was doing an MA in Stats. There is a huge userbase, and lots of nice guides have been written. What really helped me was to understand and focus on the particular tasks that I would be using R for. A programming language is just a tool, and you learn tools by using them.

batman1425
05-01-2017, 06:17 PM
The hardest thing for me, as a biomedical scientist with a bench background, was using syntax to refer to very large data sets without a GUI (Excel) crutch. I had a hard time forming a mental image of what the "tables" looked like, how to reference the correct data from them, and the syntax to pull it. I had no programming or even terminal execution experience, so it was all novel to me. I can now follow instructions and, with enough time, modify them slightly.

For all of my training my n's were 99.9% of the time less than 25, which something like GraphPad is more than enough to handle for my purposes. 25x 8GB sequencing files with terabases of DNA sequence information is a whole different animal.

sonicCows
05-01-2017, 06:39 PM
MATLAB is my bread and butter, but for learning R and general data analysis this book is free and pretty good:
http://www-bcf.usc.edu/~gareth/ISL/
with accompanying video lectures http://fs2.american.edu/alberto/www/analytics/ISLRLectures.html

summilux
05-01-2017, 06:45 PM
I'm the director of a graduate Bioinformatics program. R is a fairly straightforward language to learn and, as batman pointed out, there are plenty of free online resources for learning it. Is there a reason you want to use R? My personal preference is for MATLAB.

45K10
05-01-2017, 08:05 PM
Here is a link to a data carpentry workshop my wife is putting on at the Northeastern MSC
https://drk-lo.github.io/2017-05-18-NahantNUMSC/

It will cover a bunch of R stuff. It is almost at capacity. It is on Nahant so if you decide to come down bring your bike and I'll take you out for a ride.

cassa
05-01-2017, 10:09 PM
If what you need is to do some data analysis, the collection of packages called "the tidyverse" (written by Hadley Wickham and others) makes it relatively easy to read in and then manipulate tabular data (adding new columns computed from others, filtering out rows, grouping, joining, etc.) and then visualize it in nice pretty plots.
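To give a feel for those operations, here is a minimal sketch in plain Python (not R or the tidyverse itself). The rows and column names are invented for illustration; the comments name the dplyr verbs that play the same role.

```python
from collections import defaultdict

# Hypothetical rows, invented for illustration.
rides = [
    {"rider": "a", "km": 40.0, "hours": 2.0},
    {"rider": "b", "km": 25.0, "hours": 1.25},
    {"rider": "a", "km": 60.0, "hours": 2.5},
    {"rider": "b", "km": 8.0, "hours": 0.5},
]

# Add a new column computed from others (the role of dplyr's mutate).
for r in rides:
    r["kmh"] = r["km"] / r["hours"]

# Filter out rows (the role of filter): drop rides shorter than 10 km.
long_rides = [r for r in rides if r["km"] >= 10]

# Group and summarise (the role of group_by + summarise): total km per rider.
total_km = defaultdict(float)
for r in long_rides:
    total_km[r["rider"]] += r["km"]

print(dict(total_km))
```

In the tidyverse each of these steps is a single verb chained onto the previous one, which is a large part of why it feels easy for this kind of work.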

Here's a pretty good book (with free online version) that walks you through those kinds of uses:

R for Data Science (http://r4ds.had.co.nz/)

And download RStudio (https://www.rstudio.com/) if you want a nice development environment.

verticaldoug
05-02-2017, 03:13 AM
Originally Posted by batman1425:
"For all of my training my n's were 99.9% of the time less than 25, which something like GraphPad is more than enough to handle for my purposes. 25x 8GB sequencing files with terabases of DNA sequence information is a whole different animal."

Call me when your thumbdrive turns into AWS Snowmobile.

marciero
05-02-2017, 05:37 AM
R Commander is a GUI-like interface, if you want to avoid the command line.
A colleague is using it for an intro stats course.

marciero
05-02-2017, 05:51 AM
Originally Posted by summilux:
"I'm the director of a graduate Bioinformatics program. R is a fairly straightforward language to learn and, as batman pointed out, there are plenty of free online resources for learning it. Is there a reason you want to use R? My personal preference is for MATLAB."

R is free and open source. Also, R appears to be more widely used among statisticians and data science-y types. Matlab is more prevalent among engineers, and also some scientists. My sense is that there are not many statisticians who use Matlab.

Am curious about your preference for Matlab. But bioinformatics is a very broad area, with lots of cross-over with electrical engineering, especially signal processing, where Matlab is a pretty standard tool. Do you prefer Matlab even for purely statistical analyses? Do you use the statistics and/or bioinformatics toolboxes?

batman1425
05-02-2017, 07:52 AM
Originally Posted by verticaldoug:
"Call me when your thumbdrive turns into AWS Snowmobile."

My comment wasn't meant to incite a "my data is biggest" contest, merely to point out that the interfaces that most efficiently deal with n's on the order of 10^2 are very different from those that deal with 10^10 to 10^12, and many bench-trained biologists like myself do not have much experience managing data sets that large.

Likes2ridefar
05-02-2017, 08:04 AM
I recently had to analyze a dataset consisting of about three million rows by 9 columns. It came in a bunch of .csv files that had to be stacked, cleaned, sorted, etc. before doing statistical analysis. I know Java already, so I considered it and other coding languages. I played with Python and R, but ultimately used Minitab for most everything.

I don't know what you are needing to do, but I found Minitab pretty easy to use.
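For anyone curious what the stack/clean/sort step looks like in code, here is a small sketch using only Python's standard library. The two strings stand in for separate .csv files, and the column names and values are made up; this is not the actual dataset or workflow described above.

```python
import csv
import io

# Hypothetical stand-ins for two separate .csv files (invented data).
part_1 = "id,value\n3, 7.5\n1,2.0\n"
part_2 = "id,value\n2,\n4,1.25\n"

rows = []
for text in (part_1, part_2):
    # Stack: append each file's rows onto one combined list.
    for row in csv.DictReader(io.StringIO(text)):
        value = row["value"].strip()
        # Clean: skip rows with a missing value, convert the rest to numbers.
        if not value:
            continue
        rows.append({"id": int(row["id"]), "value": float(value)})

# Sort by id before doing any analysis.
rows.sort(key=lambda r: r["id"])

print(rows)
```

With real files you would open each path instead of wrapping a string in io.StringIO; the loop body stays the same.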

MattTuck
05-02-2017, 08:55 AM
Thanks all, this is incredibly helpful stuff. I can't give much detail about why I'm interested, other than to say it's to do some data exploration, visualizations, analytics and predictive modeling on basic types of data: age, gender, background, work experience, etc. Possibly also some latent semantic analysis.

summilux
05-02-2017, 09:00 AM
Originally Posted by marciero:
"Am curious about your preference for Matlab. But bioinformatics is a very broad area, with lots of cross-over with electrical engineering, especially signal processing, where Matlab is a pretty standard tool. Do you prefer Matlab even for purely statistical analyses? Do you use the statistics and/or bioinformatics toolboxes?"

We do a fair bit of image processing and I prefer MATLAB because of this. I'm a bit of an outlier in the bioinformatics field, but I guess some of my preference comes from knowing MATLAB better than R. The biostatisticians use R. One thing I dislike about R is its limited capacity for exporting high-quality publication figures. You can do it, but MATLAB is slicker. I have a decent budget for computation so it doesn't bother me that I have to pay. If I wanted a free program, then R for sure.

smontanaro
05-02-2017, 09:30 AM
If you're used to more well-put-together languages like Python, it won't be terribly fun. (That's my experience as a 20+ year Python programmer.) Where I work, both R and Python are used. I try and only dabble in R when necessary.
