
According to Quick-R, logistic regression (or logit) is the regression model to use when predicting a binary outcome from a set of continuous predictor variables. In my case, I was studying the effect of several continuous variables on a binary outcome: having skin wounds.

I used the glm() function (glm stands for Generalized Linear Model), where wounds is the binary outcome (1 = wound; 0 = no wounds) and var1…var12 are continuous variables.

This was my command:

fit <- glm(wounds ~ var1 + var2 + var3, family = 'binomial')

Then, several commands are useful to extract coefficients and values from fit:

summary(fit) # display results
confint(fit) # 95% CI for the coefficients
exp(coef(fit)) # exponentiated coefficients
exp(confint(fit)) # 95% CI for exponentiated coefficients

Once the model is fitted, the results can be displayed in a very nice way:

col_1 <- round(exp(cbind(OR = coef(fit), confint(fit))), digits = 2)
col_2 <- round(coef(summary(fit))[, 4], digits = 2)
final_table <- cbind(col_1, p.value = col_2)
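As a self-contained sketch, the whole workflow can be reproduced on simulated data (the sample size, variable names and effect sizes below are invented for illustration):

```r
# Simulated data (hypothetical): wounds depends mildly on var2
set.seed(42)
n <- 200
var1 <- rnorm(n)
var2 <- rnorm(n)
var3 <- rnorm(n)
wounds <- rbinom(n, 1, plogis(-1 + 0.8 * var2))

# Fit the logistic regression
fit <- glm(wounds ~ var1 + var2 + var3, family = binomial)

# Odds ratios with confidence intervals, plus p-values
col_1 <- round(exp(cbind(OR = coef(fit), confint(fit))), digits = 2)
col_2 <- round(coef(summary(fit))[, 4], digits = 2)
final_table <- cbind(col_1, p.value = col_2)
final_table
```

Running this prints a four-row table (intercept plus three variables) in the same shape as the one shown below.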

By the way, the command coef(summary(fit))[, 4] is what extracts the p-values from the model summary.

The outcome is a table like this:

                                      OR      2.5 %         97.5 %    p.value
(Intercept)                            0.01     0.00           0.09        0.06
var1                                   0.84     0.26           0.27        0.77
var2                                   1.12     1.02           1.25        0.03
var3                                   0.91     0.69           1.19        0.48
var4                                   4.13     0.78           2.76        0.11

And that’s all.



Two days ago I was in a hurry, and I didn't want to search for how to import tables rendered in R into LibreOffice. I came across several pages and tutorials explaining how to install packages to import tables into Word. Since I work with GNU/Linux, I use LibreOffice, and I'd rather not install more packages in my R setup.

The tutorial proposed by G-Force was to export the table to a nice HTML format, import it into LibreOffice, and then copy-paste it into Word. A bit too complicated, isn't it?

Ok, I found a simple way, which actually can be used with either Word or LibreOffice.

After I built a table in R using rbind() and cbind(), colnames() and rownames(), the next command I typed was:

write.csv(my_table, file = "my_table.csv")

As you can see, it's a very easy way to output a table: it simply generates a .CSV file.
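As a minimal sketch (the table contents below are invented), the export looks like this:

```r
# Hypothetical table built with cbind(), rownames() and colnames()
my_table <- cbind(OR = c(1.12, 0.91), p.value = c(0.03, 0.48))
rownames(my_table) <- c("var2", "var3")

# Write it out; the first column of the file will hold the row names
write.csv(my_table, file = "my_table.csv")

# Read it back to check what was written
read.csv("my_table.csv")
```

The resulting file is plain comma-separated text, so any text editor can open it.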

First of all, open the .CSV file with a text editor and copy-paste the content into LibreOffice.

Then select the text in LibreOffice and go to Table → Convert → Text to Table. In the pop-up window, under Separate text at, choose Other and enter a comma.

The whole process is similar in Word. From there, you can format your table as you want.




Recently I came across (and it's not the first time it has happened) a common problem while working with R: how to use some values from a function's output. Sometimes it's useful to include just the p-value in a table, or the odds ratio, or something like that.

Almost every R function returns an object, usually a list of named components. While for some functions, such as summary(), many values can be accessed with coef(), I often find it more useful to use names(). Let's see.

chisq.test(therapy, wound)

	Pearson's Chi-squared test with Yates' continuity correction

data:  therapy and wound
X-squared = 0, df = 1, p-value = 0.02

I just want to access the p-value, which is 0.02. The solution is:

names(chisq.test(therapy, wound))
[1] "statistic" "parameter" "p.value"   "method"
[5] "data.name" "observed"  "expected"  "residuals"
[9] "stdres"

So, I can access the p-value by either command:

chisq.test(therapy, wound)$p.value
chisq.test(therapy, wound)[3]

The former command is what I was looking for. The latter also yields the name of the value, that is, p.value.
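If the test is needed more than once, it is cleaner to run it once, store the result object, and pull values out of it. A small sketch with made-up data (the therapy and wound factors below are invented):

```r
# Hypothetical 2x2 example: 50 patients per therapy group
therapy <- factor(rep(c("A", "B"), each = 50))
wound <- factor(c(rep("yes", 30), rep("no", 20),
                  rep("yes", 15), rep("no", 35)))

# Run the test once and keep the result
test <- chisq.test(therapy, wound)

test$p.value           # the bare number
round(test$p.value, 2) # rounded, ready for a table
```

This avoids recomputing the test every time a component is needed.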



Now that I am drafting my dissertation, an issue has come up: how to get nice, high-quality graphics in LaTeX? Previously I used to produce .PNG images, but then I realised they are not suitable for publication, as scientific journals ask for vector plots. The easiest way to get quality graphics is to produce a .PDF rather than a .PNG image.

In order to get one in R, open a PDF device, plot, and close the device:

pdf('my_plot.pdf')
plot(dataset)
dev.off()

In LaTeX (with the graphicx package loaded):

\includegraphics{my_plot.pdf}

The outcome is quite good, but I want extra goodies: why not use LaTeX typesetting in my plots? Look at this sample:

How did I do it? Probably the best workaround I found is on Freakonometrics. We have two choices: using a .TEX file or using a .PDF file. Let's see how to do it.


The tikzDevice package converts graphics produced in R into code that can be interpreted by the LaTeX package tikz. TikZ provides a very nice vector drawing system for LaTeX.

First of all, we must install and load the package:

install.packages('tikzDevice')
library(tikzDevice)

Rendering .TEX plots in R

The first approach is to render a .TEX file that can be included within the main .TEX file:

tikz('my_plot.tex')
hist(Epi.creat, main = 'Sample plot')
dev.off()

Then, in LaTeX:

\input{my_plot.tex}

The resulting .TEX file is a very simple text file, with no \begin{document} statement.
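For reference, a minimal main document wrapping the generated file might look like this (the file name is, of course, a placeholder):

```latex
\documentclass{article}
\usepackage{tikz}   % tikzDevice output needs the tikz package

\begin{document}

\begin{figure}
  \centering
  \input{my_plot.tex}
  \caption{Sample plot rendered with tikzDevice.}
\end{figure}

\end{document}
```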

Rendering .PDF plots in R

Alternatively, we can render a .PDF file:

tikz('my_plot.tex', standAlone = TRUE)
hist(Epi.creat, main = 'Sample plot')
dev.off()
tools::texi2dvi('my_plot.tex', pdf = TRUE)

This .TEX file is a complete, standalone document that can be compiled with pdflatex. The last command transforms the .TEX file into a .PDF, which can then be included as an image in a LaTeX file.

From Stack Overflow, the advantages of the tikz() function as compared to pdf() are:

  • The font of labels and captions in your figures always matches the font used in your LaTeX document. This provides a unified look to your document.
  • You have all the power of the LaTeX typesetter available for creating mathematical annotation and can use arbitrary LaTeX code in your figure text.
  • Using tikz will give you a look consistent with the rest of your document, since LaTeX typesets all the text in your graphs.

Another source: tikzdevice-demo



When analysing data, it's good practice to check whether our data are valid. I have a dataset in CSV format, and I have to explore it to know what I have. It was long, tedious work to write a proper SQL statement to select my data, and I have to make sure those data have quality: not too many missing values, and not too many outliers.

What's an outlier? It's a value that is anomalous with respect to the rest of the data, too distant from the other values. A statistical analysis is not reliable if I don't check for outliers.

Let's have a look at my own experience. After great effort, I finally produced my CSV file. The first thing I do:

patients <- read.table("research.csv", header=TRUE, sep=",")

What I found was this:

Some variables have wrong values. For instance, peso (weight): it's quite unlikely that someone weighs 670 kg. My first attempt to track down these values was:

max(peso)        # gives the maximum value, but not where it is
which.max(peso)  # gives the 'index', so I can find it
[1] 12

The problem is that this finds the values only one by one, which is not practical. But I can find all of them at once, without having to write any further function:

which(peso > 150)
[1]   12  219  386  688 1209 1729 2254

Now I know the index of each suspicious value, selected using a simple criterion (weight > 150 kg).

Finally, it's up to me to decide what to do with these values: fix them or delete them? It depends. If I decide it's only a typing mistake, I can fix it. If I'm unable to find the correct value, it's better to delete the whole row.
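A minimal sketch of both options, on invented weights (the 670 and 812 entries play the role of data-entry errors):

```r
# Hypothetical weights with two impossible values
peso <- c(70, 82, 670, 65, 90, 812)

idx <- which(peso > 150)   # indices of the suspicious values
idx                        # 3 and 6

# Option 1: fix a known typing mistake (suppose 670 was really 67.0)
peso[3] <- 67

# Option 2: drop values that cannot be recovered
peso_clean <- peso[peso <= 150]
peso_clean
```

In a data frame, the same filter would drop whole rows, e.g. patients[patients$peso <= 150, ].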

When researching, one of the most important things to remember is: be honest. Do not fake an analysis.

