Posted
Comments None

Now that I am drafting my dissertation, an issue has just come up: how do I get good-quality graphics in LaTeX? I used to produce .PNG images, but then I realised they are not suitable for publication, as scientific journals ask for vector plots. The easiest way to get quality graphics is to produce a .PDF rather than a .PNG image.

To get one of these in R:

pdf('my_plot.pdf')
# function_whatever()
# plot(dataset)
dev.off()

In LaTeX:

\includegraphics[scale=0.8]{my_plot.pdf}

The outcome is quite good, but I want more: why not use LaTeX typesetting in my plots? Look at this sample:

How did I do it? Probably the best workaround I have found is the one on Freakonometrics. There are two choices: generating a .TEX file or generating a .PDF file. Let’s see how to do it.

Installation

The tikzDevice package converts graphics produced in R into code that can be interpreted by the LaTeX package tikz. TikZ provides a very nice vector drawing system for LaTeX.

First of all, we must install the package:

# install the package
install.packages("tikzDevice")
# then, use the library
library(tikzDevice)

Rendering .TEX plots in R

The first approach is to generate a .TEX file that can be included in the main .TEX file:

tikz('my_plot.tex')
hist(Epi.creat, main = 'Sample plot' )
dev.off()

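Then, in LaTeX:

\input{my_plot.tex}

The generated .TEX file is a plain text file that contains only the picture, with no \begin{document} statement, so the main document must load the tikz package (\usepackage{tikz}) in its preamble. \input simply pastes the file in place, whereas \include is meant for whole chapters and forces a page break, so it does not suit figures.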

Rendering .PDF plots in R

Alternatively, we can render a .PDF file:

tikz('my_plot.tex', standAlone = TRUE)
hist(Epi.creat, main = 'Sample plot')
dev.off()
tools::texi2dvi('my_plot.tex', pdf = TRUE)

With standAlone = TRUE, the .TEX file is a complete document that can be compiled with pdflatex. The last command compiles the .TEX file into a .PDF file, which can then be included as an image in a LaTeX file.

According to Stack Overflow, the advantages of the tikz() function over pdf() are:

  • The font of labels and captions in your figures always matches the font used in your LaTeX document. This provides a unified look to your document.
  • You have all the power of the LaTeX typesetter available for creating mathematical annotation and can use arbitrary LaTeX code in your figure text.
  • In short, tikz gives your figures a look consistent with the rest of your document, because LaTeX typesets all the text in your graphs.

Another source: tikzdevice-demo

Author

Posted
Comments None

When analysing data, it’s good practice to check whether the data are valid. I have a dataset in CSV format, and I have to explore it to know what I have. It was long, tedious work to write a proper SQL statement that selected my data, and now I have to make sure the data are of good quality: not too many missing values, and not too many outliers.

What’s an outlier? It’s an anomalous value that lies far from the rest of the data. No statistical analysis is reliable if I don’t check for outliers.

Let’s have a look at my own experience. After great effort, I managed to produce my CSV file. The first thing I do:

patients <- read.table("research.csv", header=TRUE, sep=",")
attach(patients)
summary(patients)

What I found was this:

Some variables have wrong values. For instance, peso (weight): it’s quite unlikely that someone weighs 670 kg. My first attempt to locate these values was:

max(peso)  # it gives the maximum value, but not where it is
[1] 670
which.max(peso)  # yeah, it gives me the 'index', so I can find it
[1] 12


The problem is that this way I can only find these values one by one, which is not practical. But I can find them all at once without having to write any further function:

which(peso>150)
[1]   12  219  386  688 1209 1729 2254


Now I know the index of each value that meets a given criterion (weight > 150 kg).

Finally, it’s up to me to decide what to do with these values: fix them? Delete them? Well, it depends. If I decide a value is only a typing mistake, I can fix it. If I’m unable to find the correct value, it’s better to delete the whole row.
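
For readers who handle the same kind of CSV from Python, the flag-then-drop decision above can be sketched as follows. This is only an illustrative sketch using the standard library: the sample rows, the id column and the helper names are assumptions, and only the peso column and the 150 kg criterion come from this post. Note that Python indices start at 0, whereas R’s which() counts from 1.

```python
import csv
import io

# Hypothetical sample standing in for research.csv; only the 'peso'
# (weight, in kg) column name comes from the post.
SAMPLE_CSV = """id,peso
1,72
2,670
3,81
4,
5,155
"""

MAX_PLAUSIBLE_KG = 150  # same criterion as which(peso > 150)


def find_outliers(rows, column, limit):
    """Return the 0-based indices of rows whose value exceeds the limit."""
    indices = []
    for i, row in enumerate(rows):
        value = row[column]
        if value and float(value) > limit:  # missing values are skipped
            indices.append(i)
    return indices


def drop_rows(rows, indices):
    """Return a copy of rows without the given indices."""
    unwanted = set(indices)
    return [row for i, row in enumerate(rows) if i not in unwanted]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
outliers = find_outliers(rows, "peso", MAX_PLAUSIBLE_KG)
cleaned = drop_rows(rows, outliers)

print("outlier indices:", outliers)
print("rows kept:", len(cleaned))
```

Dropping is shown here only because it is the simpler branch; fixing a typing mistake would mean editing the value in place instead.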

When doing research, one of the most important things to remember is to be honest. Do not fake an analysis.


Posted
Comments None

I got used to developing web applications with PHP and MySQL. I thought the only right way to use a MySQL connection was through a web application written in PHP, Ruby on Rails or even Python (Django). I didn’t need any other way of connecting to a MySQL server, apart from managers like phpMyAdmin. However, now that I’m doing my PhD, I need to dig into the tables of a MySQL database to extract certain data. I began with bash on Debian GNU/Linux, but I realised I needed something more versatile, so I thought Python could help me.

There are several ways to connect to a MySQL server from Python. In the end, I came across the two preferred options: MySQLdb and mysql-connector.

As stated on Stack Overflow, MySQLdb is a C module that links against the MySQL protocol implementation in the libmysqlclient library. It’s faster but requires the library in order to work. In contrast, mysql-connector is a Python module that reimplements the MySQL protocol in Python. It is slower, but does not require the C library and so is more portable.

There’s a nice, interesting comparison by Charles Nagy. The speed differences are probably very small for my intended purposes (the client and the server are on the same computer), so I chose MySQLdb. To install it in Debian:

sudo apt-get install python-mysqldb

The steps are:
  1. Open the connection
  2. Open the cursor
  3. Write (and execute) the query
  4. Commit the query if necessary
  5. Retrieve data if necessary
  6. Close the cursor
  7. Close the connection

Here is a little example, taken directly from kitebird.com:

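#!/usr/bin/python
import MySQLdb

# open the connection (credentials of a local test database)
conn = MySQLdb.connect(
    host="localhost",
    user="testuser",
    passwd="testpass",
    db="test")
# open the cursor and run the query
cursor = conn.cursor()
cursor.execute("SELECT VERSION()")
# fetch the single result row
row = cursor.fetchone()
# a formatted string prints the same under Python 2 and Python 3
print("server version: %s" % row[0])
cursor.close()
conn.close()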

This little script just shows the version of the MySQL server. I’ll write a new post about SELECT statements. See you then.
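
The seven steps listed earlier are the same for any driver that follows Python’s DB-API 2.0 (PEP 249), which MySQLdb does. As a hedged sketch that runs without a MySQL server, the example below swaps in the standard-library sqlite3 module and an in-memory database; the patients table and its rows are made up for illustration. With MySQLdb, connect() would take host, user, passwd and db instead, and the parameter placeholder would be %s rather than ?.

```python
import sqlite3  # stand-in for MySQLdb: both follow the DB-API 2.0

# 1. Open the connection (in-memory database, nothing touches the disk)
conn = sqlite3.connect(":memory:")
try:
    # 2. Open the cursor
    cursor = conn.cursor()

    # 3. Write (and execute) the queries; the table is hypothetical
    cursor.execute(
        "CREATE TABLE patients (id INTEGER PRIMARY KEY, peso REAL)"
    )
    cursor.executemany(
        "INSERT INTO patients (peso) VALUES (?)",
        [(72.0,), (670.0,), (81.5,)],
    )

    # 4. Commit, because the INSERT changed data
    conn.commit()

    # 5. Retrieve data; the threshold is passed as a query parameter
    cursor.execute(
        "SELECT id, peso FROM patients WHERE peso > ? ORDER BY id",
        (150,),
    )
    suspicious = cursor.fetchall()
    print("suspicious rows:", suspicious)

    # 6. Close the cursor
    cursor.close()
finally:
    # 7. Close the connection, even if a query failed
    conn.close()
```

Passing the threshold as a parameter, rather than pasting it into the SQL string, lets the driver handle quoting.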


Posted
Comments None

It’s been a long time since I last wrote on this blog. I don’t know exactly why I gave up writing, for I like reading and writing in English. Somehow, it’s exciting.

I started this blog several years ago, when I was very involved in the development of web applications, PHP programming and CodeIgniter. I keep on programming, of course, but nowadays I’m interested in other topics, such as R, LaTeX, publishing medical articles and my own thesis.

Another interest is English. I love English. I decided several months ago that I would try to improve my English skills. That’s why I will write all my posts in English.

The last posts on my blog were written in English. I realise my English is very poor, but I’m doing my best.

I will devote this blog to development issues: R, LaTeX, maybe CodeIgniter. I will write my thoughts in my other blog (same CMS, same theme). I think it could be a good idea to keep them separate. Let’s see.

I imported all the posts and old stuff from my old blog. From now on, the old blog is officially dead. Don’t be surprised if you see some posts in Spanish.

Again, welcome to my reborn site.


Posted
Comments None

Up to now, the only place I had used URL arguments in CodeIgniter was the Pagination Class. But they have more uses, as you can retrieve information from your URI strings. Although I’ve been using CodeIgniter for 2 years, I had never had to use $this->uri->segment(n).

Let’s see an example. Let’s type this URL:

http://localhost/index.php/site/main/parameter_1

The segment numbers would be:
  1. site
  2. main
  3. parameter_1

It’s not necessary to know how the segments work in order to access the parameters, as CodeIgniter knows which is the controller (site), which is the method (main) and which is the parameter (obviously, parameter_1).

But I came across the following problem: what happens if there is a 4th segment I have to use? Let’s say I have:

http://localhost/index.php/site/main/parameter_1/parameter_2

The segment numbers would be:
  1. site
  2. main
  3. parameter_1
  4. parameter_2

When I worked with plain PHP, the solution was GET, through $_GET['parameter'], but CodeIgniter is smarter.

You can access the segments this way:

$site = $this->uri->segment(1);
$main = $this->uri->segment(2);
$parameter_1 = $this->uri->segment(3);
$parameter_2 = $this->uri->segment(4);

So you can easily retrieve information from your URL without using GET:

$this->uri->segment(1); // controller
$this->uri->segment(2); // method
$this->uri->segment(3); // 1st parameter
$this->uri->segment(4); // 2nd parameter
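
To make the numbering concrete, here is a small Python sketch of the idea behind segment(n). It is an illustration only, not CodeIgniter’s implementation: uri_segments and segment are hypothetical helper names, and the sketch assumes that everything after index.php counts as a segment, numbered from 1.

```python
from urllib.parse import urlparse


def uri_segments(url, front_controller="index.php"):
    """Split the path after the front controller into a list of segments."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if front_controller in parts:
        parts = parts[parts.index(front_controller) + 1:]
    return parts


def segment(url, n, default=None):
    """Return the n-th segment (1-based), or default if it is missing."""
    parts = uri_segments(url)
    return parts[n - 1] if 1 <= n <= len(parts) else default


url = "http://localhost/index.php/site/main/parameter_1/parameter_2"
print(segment(url, 1))  # controller
print(segment(url, 2))  # method
print(segment(url, 3))  # 1st parameter
print(segment(url, 4))  # 2nd parameter
print(segment(url, 5))  # missing segment, falls back to the default
```

The 1-based numbering mirrors the lists above: the controller and the method take the first two positions, so the parameters start at position 3.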

I hope it’s useful to someone. Bye!
More information: Stack Overflow, and URI Class – CodeIgniter

