
When analysing data, it’s good practice to check whether the data are valid. I have a dataset in CSV format, and I have to explore it to know what I have. It was long, tedious work to write a proper SQL statement that selected my data, and now I have to make sure the data are of good quality: not too many missing values, and not too many outliers.

What’s an outlier? It’s an anomalous value that lies far from the rest of the data. No statistical analysis is reliable if I don’t check for outliers.

Let’s have a look at my own experience. After a great deal of effort, I finally produced my CSV file. The first thing I do is:

patients <- read.table("research.csv", header=TRUE, sep=",")

What I found was this:

Some variables have wrong values. For instance, peso (weight): it’s quite unlikely that someone weighs 670 kg. My first attempt to find these values was:

max(patients$peso)  # gives the maximum value, but not where it is
which.max(patients$peso)  # gives the index, so I can find it
[1] 12

The problem is that this way I can only find these values one by one, which isn’t practical. But I can find them all at once without writing any extra function:

which(patients$peso > 150)
[1]   12  219  386  688 1209 1729 2254

Now I know the index of every value that meets a given criterion (weight > 150 kg).

Finally, it’s up to me to decide what to do with these values: fix them? Delete them? Well, it depends. If I conclude it’s only a typing mistake, I can fix it. If I’m unable to find the correct value, it’s better to delete the whole row.
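
The same idea (find the positions that meet a criterion, then drop those rows) can be sketched in plain Python. This is only an illustration: the weights are made up, and outlier_indices and drop_rows are hypothetical helper names, not part of any library.

```python
# Hypothetical weights in kg; 670.0 is the kind of typing mistake discussed above.
weights = [72.5, 670.0, 81.0, 64.3, 155.0, 90.1]
THRESHOLD_KG = 150  # same criterion as in the text


def outlier_indices(values, threshold):
    """Return the 1-based positions of the values above the threshold."""
    return [i for i, v in enumerate(values, start=1) if v > threshold]


def drop_rows(values, indices):
    """Return a copy of values without the rows at the given 1-based positions."""
    bad = set(indices)
    return [v for i, v in enumerate(values, start=1) if i not in bad]


idx = outlier_indices(weights, THRESHOLD_KG)
cleaned = drop_rows(weights, idx)
print(idx)
print(cleaned)
```

Positions are counted from 1 here only to match what R reports; Python lists themselves are indexed from 0.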

When doing research, one of the most important things to remember is to be honest. Do not fake an analysis.



I got used to developing web applications with PHP and MySQL. I thought the only proper way to use a MySQL connection was through a web application written in PHP, Ruby on Rails or even Python (Django). I didn’t need any other way of connecting to a MySQL server, apart from managers like phpMyAdmin. However, now that I’m doing my PhD, I need to dig into the tables of a MySQL database to extract certain data. I began using bash on Debian GNU/Linux, but I realised I needed something more versatile, so I thought Python could help me.

There are several ways to connect to a MySQL server from Python. I eventually came across the two preferred ones: MySQLdb and mysql-connector.

As stated on Stack Overflow, MySQLdb is a C module that links against the MySQL protocol implementation in the libmysqlclient library. It’s faster but requires the library in order to work. By contrast, mysql-connector is a Python module that reimplements the MySQL protocol in Python. It is slower, but does not require the C library and so is more portable.

There’s a nice, interesting comparison by Charles Nagy. The speed difference is probably very small for my intended purposes (the client and the server are on the same computer), so I chose MySQLdb. To install it on Debian:

sudo apt-get install python-mysqldb

The steps are:
  1. Open the connection
  2. Open the cursor
  3. Write (and execute) the query
  4. Commit the query if necessary
  5. Retrieve data if necessary
  6. Close the cursor
  7. Close the connection
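
The seven steps above can be sketched in a form that runs without a MySQL server. This is only an illustration under stated assumptions: it swaps in sqlite3 from the Python standard library, which follows the same connection/cursor pattern, and the patients table and its values are made up.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")   # 1. open the connection
cursor = conn.cursor()               # 2. open the cursor

# 3. write and execute the queries (hypothetical table and data)
cursor.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, peso REAL)")
cursor.executemany(
    "INSERT INTO patients (peso) VALUES (?)",
    [(72.5,), (670.0,), (81.0,)],
)
conn.commit()                        # 4. commit, because data was written

cursor.execute("SELECT id, peso FROM patients WHERE peso > ?", (150,))
rows = cursor.fetchall()             # 5. retrieve the data

cursor.close()                       # 6. close the cursor
conn.close()                         # 7. close the connection

print(rows)
```

With MySQLdb the sequence of calls is the same, but the placeholder in parameterised queries is %s instead of ?.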

Here is a little example, borrowed from elsewhere:

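import MySQLdb

# 1. Open the connection
conn = MySQLdb.connect(
    host="localhost",
    user="testuser",
    passwd="testpass",
    db="test")
# 2. Open the cursor
cursor = conn.cursor()
# 3. Execute the query
cursor.execute("SELECT VERSION()")
# 5. Retrieve the data (a SELECT needs no commit)
row = cursor.fetchone()
print("server version: %s" % row[0])
# 6 and 7. Close the cursor, then the connection
cursor.close()
conn.close()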

This little script just shows the version of the MySQL server. I’ll write a new post about SELECT statements. See you then.


It’s been a long time since I last wrote on this blog. I don’t know exactly why I gave up writing, because I like reading and writing in English. Somehow it’s exciting.

I started this blog several years ago when I was very involved in the development of web applications, PHP programming and CodeIgniter. I keep on programming, of course, but nowadays I’m interested in other topics, such as R, LaTeX, publishing medical articles and my own thesis.

Another interest is English. I love English. I decided several months ago that I would try to improve my English skills. That’s why I will write all my posts in English.

The last posts on my blog were written in English. I realize my English is very poor, but I’m doing my best.

I will devote this blog to development issues: R, LaTeX, maybe CodeIgniter. I will write my thoughts in my other blog (same CMS, same theme). I think it could be a good idea to keep them separate. Let’s see.

I imported all the posts and old stuff from my old blog. From now on, the old blog is officially dead. Don’t be surprised if you see some posts in Spanish.

Once again, welcome to my reborn site.



Up to now, the only place I had used URL arguments in CodeIgniter was the Pagination class. But they have more uses, since you can retrieve information from your URI strings. Although I’ve been using CodeIgniter for 2 years, I had never had to use $this->uri->segment(n).

Let’s see an example. Let’s type this URL:


The segment numbers would be:
  1. site
  2. main
  3. parameter_1

It’s not necessary to know how the segments work in order to access the parameters, as CodeIgniter knows which is the controller (site), which is the method (main) and which is the parameter (obviously, parameter_1).

But I came across the following problem: what happens if I have a 4th segment (a second parameter) that I need to use? Let’s say I have:


The segment numbers would be:
  1. site
  2. main
  3. parameter_1
  4. parameter_2

When I worked with plain PHP, the solution was GET, by using $_GET['parameter'], but CodeIgniter is smarter.

You can access the segments this way:

$site = $this->uri->segment(1);
$main = $this->uri->segment(2);
$parameter_1 = $this->uri->segment(3);
$parameter_2 = $this->uri->segment(4);

So you can easily retrieve information from your URL without using GET:
$this->uri->segment(1); // controller
$this->uri->segment(2); // method
$this->uri->segment(3); // 1st parameter
$this->uri->segment(4); // 2nd parameter
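
To make the numbering concrete, here is a minimal Python sketch of looking up path segments by their 1-based position. It is not how CodeIgniter implements its URI class; the segment function below and the example.com URL are made up for illustration.

```python
from urllib.parse import urlparse


def segment(url, n, default=None):
    """Return the n-th (1-based) path segment of url, or default if it is absent."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    return parts[n - 1] if 1 <= n <= len(parts) else default


# Hypothetical URL with a controller, a method and two parameters.
url = "http://example.com/site/main/parameter_1/parameter_2"

print(segment(url, 1))  # controller
print(segment(url, 2))  # method
print(segment(url, 3))  # 1st parameter
print(segment(url, 4))  # 2nd parameter
print(segment(url, 5))  # nothing there, so the default is returned
```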

I hope it’s useful to someone. Bye!
More information: Stack Overflow, and URI Class – CodeIgniter


In CodeIgniter, the Form Helper file contains functions that assist in working with forms. I’ve used it to write less code. You know, the less code you write, the more productive you are.

Several posts ago I wrote an entry about dropdowns and CodeIgniter, and the different ways to do the same thing. CodeIgniter is versatile enough not only to allow different ways to code, but also to let us write less code and get better results.

My problem is that I have to insert a dropdown. As I mentioned previously (sorry, only in Spanish), I decided to use the CodeIgniter way of doing things.

But I use the Form Validation class, so I need to set a form value to re-populate the form after the page reloads. set_value() has helped me to re-populate values in the input fields, but how can I do the same in a dropdown?

There are several ways, but I think things can be done with the help of CodeIgniter.

Let’s say we have this:

$colors = array('0' => 'Red',
                '1' => 'Blue',
                '2' => 'Green');
echo form_dropdown('color', $colors, '0');

As you know, this will render the following HTML:

<select name="color">
<option value="0" selected="selected">Red</option>
<option value="1">Blue</option>
<option value="2">Green</option>
</select>

I tried several times using set_value and set_select, and I must say neither of them worked for me. So I tried something I saw on Stack Overflow, which avoids set_value and set_select altogether.

function dropdown_colors()
{
	$result = $this->site_model->colors();
	$colors = array();
	foreach ($result as $row) {
		$colors[$row->id_color] = $row->color;
	}
	// Use the posted value if there is one; otherwise default to '1'
	$selected = ($this->input->post('color')) ? $this->input->post('color') : '1';
	return form_dropdown('color', $colors, $selected);
}

Notice two things. First of all, form_dropdown accepts up to four parameters, although I’ve used only three of them:

  • The first one is the name of the select box.
  • The second one is the associative array of options; you can build it by hand or retrieve it from a database.
  • The third one is the value you wish to be selected by default.
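
The role of the three parameters can be sketched in Python. This is only an illustration of the idea, not CodeIgniter’s code: render_dropdown is a hypothetical function, and the posted dictionary stands in for the submitted form data.

```python
from html import escape


def render_dropdown(name, options, selected=None):
    """Build a <select> from a dict of value -> label, marking one option as selected."""
    lines = ['<select name="%s">' % escape(name)]
    for value, label in options.items():
        mark = ' selected="selected"' if str(value) == str(selected) else ""
        lines.append(
            '<option value="%s"%s>%s</option>' % (escape(str(value)), mark, escape(label))
        )
    lines.append("</select>")
    return "\n".join(lines)


colors = {"0": "Red", "1": "Blue", "2": "Green"}
posted = {}  # hypothetical submitted data; empty on the first page load

# Re-populate with the posted value if there is one; otherwise default to "1".
selected = posted.get("color", "1")
html = render_dropdown("color", colors, selected)
print(html)
```

Checking whether the key is present, rather than whether the value is truthy, keeps a posted value of "0" from being replaced by the default.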

Second, for the third parameter I used an if…else statement. It doesn’t look like a classic if…else, does it? That’s because I used the shorthand form (the ternary operator), so I can write the statement on a single line:

echo ($myvalue == 99) ? "x is 99" : "x is not 99";

This is the same as:

if ($myvalue == 99) {
    echo "x is 99";
} else {
    echo "x is not 99";
}

This single-line statement always has the same structure:

condition ? what happens when it’s true : what happens when it’s false
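
For comparison, Python has the same kind of shorthand, although the order is different: the value for the true case comes first, then the condition. A small sketch, with describe and describe_long as made-up function names:

```python
def describe(myvalue):
    # Shorthand: value_if_true if condition else value_if_false
    return "x is 99" if myvalue == 99 else "x is not 99"


def describe_long(myvalue):
    # The same logic written as a classic if...else
    if myvalue == 99:
        return "x is 99"
    else:
        return "x is not 99"


print(describe(99))
print(describe(5))
```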

I hope you’ll find it useful. Bye.
