How is C++ STL list implemented internally

The previous article in this series called STLHow, showed how to implement STL Vector. This article will go into details about implementation of STL list.

Implementing STL List

The node

The STL list requires traversal in both the directions by means of forward and reverse iterator. Hence the list should be implemented as a doubly linked list. The node of the list is defined as below.

	template <typename T>
	struct ListNode
	{
		ListNode(const T &e) : data(e),
			                   next(0), prev(0)
		{ }

		T data;
		ListNode<T> *next;
		ListNode<T> *prev;
	};

 List class

	template <typename T>
	class list
	{
	public:
		list() : head_(0), tail_(0),
			     size_(0)
		{ }

		int size() { return size_; }
	private:
		ListNode<T> *head_;
		ListNode<T> *tail_;

		int size_;
	};

The list maintains pointers to head and tail of the list. Also the size of the list is stored explicitly. Because of the explicitly stored size, the size function can return the size of the list in constant time which would otherwise be O(n) had the list was required to be traversed to find out the size. The default constructor initializes all these private variables to zero.

Adding elements

	template <typename T>
	ListNode<T> *list<T>::insertInternal(const T &e, ListNode<T> *pos)
	{
		ListNode<T> *n = new ListNode<T>(e);

		size_++;
		// no operation below this should throw
		// as state of the list has changed and memory allocated

		n->next = pos;

		if(pos)
		{
			n->prev = pos->prev;
			pos->prev = n;
		}
		else
		{
			// pos is null that is at the very end of the list
			n->prev = tail_;
			tail_ = n;
		}

		if(n->prev)
		{
			n->prev->next = n;
		}
		else
		{
			// at the very begining of the list
			head_ = n;
		}

		return n;
	}

 

	template <typename T>
	void list<T>::push_back(const T &e)
	{
		// inserts before the position, 
        // 0 is the end() iterator
		// hence insert at the end
		insertInternal(e, 0);
	}

	template <typename T>
	void list<T>::push_front(const T &e)
	{
		// insert before the head
		insertInternal(e, head_);
	}

 

Removing elements

	template <typename T>
	void list<T>::removeInternal(ListNode<T> *pos)
	{
		if(pos)
		{
			if(pos->prev)
				pos->prev->next = pos->next;

			if(pos->next)
				pos->next->prev = pos->prev;

			if(pos == head_)
				head_ = pos->next;

			if(pos == tail_)
				tail_ = pos->prev;

			delete pos;

			size_--;
		}
	}

 

	template <typename T>
	void list<T>::pop_back()
	{
		removeInternal(tail_);
	}

	template <typename T>
	void list<T>::pop_front()
	{		
		removeInternal(head_);
	}

 

Accessing elements

		T &front()
		{ return head_->data; }

		T &back()
		{ return tail_->data; }

 

Destructor

	template <typename T>
	list<T>::~list()
	{
		ListNode<T> *current( head_ );

		while(current)
		{
			ListNode<T> *next( current->next );
			delete current;
			current = next;
		}
	}

 

Forward iterator

		class iterator
		{
		public:
			iterator(ListNode<T> *p=0) : pos_(p) { }
			
			T &operator*()
			{ return pos_->data; }

			T *operator->()
			{ return &(pos_->data); }

			bool operator!=(const iterator &rhs)
			{ return this->pos_ != rhs.pos_; }

			iterator operator++(int)
			{ pos_ = pos_->next; return *this; }

			iterator operator--(int)
			{ pos_ = pos_->prev; return *this; }

		private:
			ListNode<T> *pos_;
		};

		iterator begin()
		{ return iterator(head_); }

		iterator end()
		{ return iterator(0); }

 

Reverse iterator

		class reverse_iterator
		{
		public:
			reverse_iterator(ListNode<T> *p=0) : pos_(p) { }
			
			T &operator*()
			{ return pos_->data; }

			T *operator->()
			{ return &(pos_->data); }

			bool operator!=(const reverse_iterator &rhs)
			{ return this->pos_ != rhs.pos_; }

			reverse_iterator operator++(int)
			{ pos_ = pos_->prev; return *this; }

			reverse_iterator operator--(int)
			{ pos_ = pos_->next; return *this; }

		private:
			ListNode<T> *pos_;
		};

		reverse_iterator rbegin()
		{ return reverse_iterator(tail_); }

		reverse_iterator rend()
		{ return reverse_iterator(0); }

 

C++ auto_ptr for pointer to array

An auto_ptr can wrap a raw pointer but this raw pointer should not be a pointer to an array because auto_ptr will not release memory of the array when it goes out of scope. auto_ptr calls delete on the pointer when it goes out of scope. However for memory pointed to by an array pointer is to be freed then delete[] needs to be called and not delete.

How can you have an auto_ptr for a pointer to array?
Even before you try to answer that question, think whether you really need an auto_ptr for array pointer? Unless you are writing a low level library like STLHow, you most likely don’t need an auto_ptr for array pointer. That exactly is the reason why one is not included in the standard C++ library.

Instead use a Vector.

STLHow internally uses auto_ptr for pointer to array. Its implementation is listed below.
Check out STLHow – the STL that is meant to be readable.

			// an auto pointer to use with arrays
			// lets use arrays in create temp and swap idiom
			template 
			class auto_array
			{
			    public :
					auto_array(T *t) : ptr_( t )
					{ }

					~auto_array()
					{ delete[] ptr_; }

					T *operator->()
					{ return ptr_; }

					T *release()
					{ 
						T *tmp( ptr_ );
						ptr_ = 0;
						return tmp;
					}

					T &operator[](int i)
					{ return ptr_[i]; }

			    private :
					T *ptr_;
			};

How is C++ STL vector implemented internally

There are many open source implementations of STL. However it is difficult to read the STL code because it is meant to be practically usable than readable.

STLHow is an implementation of C++ Standard Template Library (STL) written in easy to understand and human readable code.

A series of articles here on codefreakr will discuss the STLHow to demonstrate “How the STL is implemented”.

Implementing STL Vector

Constructors and Destructor

Vector is implemented as a dynamically allocated array. The memory for this array is allocated in the constructor. As more elements are inserted the array is dynamically increased in size. A constructor without parameter creates an array with a default size. Another constructor with integer parameter creates an array of the specified size. The destructor deletes the memory allocated for the array.

	const int _DEFAULT_VECTOR_SIZE = 4;

	template <typename T>
	class vector
	{
	    public :

			// constructors
			vector()
				: array_( new T[_DEFAULT_VECTOR_SIZE] ),
				  reserved_size_( _DEFAULT_VECTOR_SIZE ),
				  size_( 0 )
			{ }

			vector(int n) // create vector with n default elements
				: array_( new T[n] ),
				  reserved_size_( n ),
				  size_( 0 )
    		{ }

			// destructor
			~vector()
			{ delete[] array_; }

	    private :

		    T *array_;
			int size_;
			int reserved_size_;
	};

Adding elements

The most frequently used function to add elements to vector is push_back. The function adds the element at the end of the vector ie. after the current last element. This is accomplished by putting the element at the size_th position. However that is not sufficient because vector is a dynamically increasing container hence if the currently allocated memory is not sufficient to hold the element then more memory should be allocated. So, see that there is sufficient memory to hold the element, if not allocate more memory and then insert the element.

    template <typename T>
    void vector<T>::push_back(const T &t)
    {
        // if we've run out of space, allocate more memory
        if(size_ == reserved_size_)
            resize(reserved_size_ + _DEFAULT_VECTOR_SIZE);

        // size_ when used as an index, points to the next position after
        // the last element in the vector
        array_[size_] = t;

        // now there is one more element in the vector, so increase the size
        size_++;
    }

The resize function is used to set the size of the reserved memory. Although this function is public and can be called by client code to change the actual size of the memory held by the vector it is used internally for the same purpose. Here is the implementation of the function.

    template <typename T>
    void vector<T>::resize(int n) // make the size of the internal array exactly n
    {
        if(n > reserved_size_) // if requested size is more than the current size
        {
            // allocate a new array of larger capacity
            // copy the old array to new array
            // destroy old array
            auto_array<T> new_array( new T[n] );
            for(int i=0; i<size_; i++)
            {
                new_array[i] = array_[i];
            }

            delete[] array_;
            array_ = new_array.release();
            reserved_size_ = n;
        }
    }

In the above snippet of code, auto_array is an auto_ptr like class. The standard auto_ptr has a destructor which performs delete on the pointer it holds. Hence it is unable to delete an array of objects which requires delete[] instead of delete to free them. The custom auto_ptr class is discussed here.

Forward Iterator

The forward iterator iterates through the vector elements starting from index zero and in increasing index order. Because the elements of the vector are stored in an contiguous array, a pointer of element type can function as a forward iterator. This shows that a simple pointer can work as an iterator hence it is often said that anything that behaves like an iterator is an iterator.

// iterator
typedef T* iterator;

The begin function returns the pointer to the first element in the array and the end function returns pointer to an element past the last element of the array. The end really refers to a memory location that should not be accessed as it is outside the limit of the array. This is the reason why it is advised to not to de-reference the end iterator.

// iterator functions
iterator begin()
{ return array_; }

iterator end()
{ return array_ + size_; }

There is no need to write any special iterator de-reference or access to member operator, as the pointer already has those operators defined.

Reverse Iterator

The reverse iterator iterates through the vector elements starting from the very last element and in decreasing index order. A wrapper class to the plain pointer can do the job of the reverse iterator. The operator++ should perform — and operator– should perform ++ on the plain pointer.

class reverse_iterator
{
    public:
    reverse_iterator(T *p)
    : pos(p) { }

    reverse_iterator()
    : pos(0) { }

    T &amp;operator*()
    { return *pos; }

    T *operator-&gt;()
    { return pos; }

    reverse_iterator operator++(int)
    { pos--; return *this; }

    reverse_iterator operator--(int)
    { pos++; return *this; }

    bool operator!=(const reverse_iterator &amp;rhs)
    { return this-&gt;pos != rhs.pos; }

    private:
    T *pos;
};

The rbegin function returns an object of reverse_iterator that holds pointer to the last element of the array and the rend function returns an object of reverse_iterator that holds pointer to an element before the first element of the array. The location pointed to by the rend is invalid and hence should not be de-referenced.

This article shows the internal implementation of the C++ STL vector container. Not all but only the commonly used features of the vector are described here. Rest of the features can easily be implemented based on this description.

In the next article in the series STLHow describes the implementation of C++ STL List container.

Vim : Block select on windows and Linux

Vim allows selection of a block or column of text. Then operations like cut or delete can be performed on the selected block of text.

Enter following command to start block selection mode.

Windows

ctrl-q

Linux

ctrl-v

Then select the block with arrow keys or h/j/k/l keys.

 

image credit Brian Dill

Sybase : Iterating over table without using cursor

Cursor is not the only way to iterate over data in a table in sybase. It is possible to iterate over a table without using cursor. This article shows an alternative to cursor in sybase.

If a table has a clustered index defined on a column (or set of columns) then the rows in that table are organized in a sorted order by the indexed columns. This property of the clustered index can be used to iterate over the rows of a table.

Let us consider a table as below.

order_id product_name quantity status
1 Soap 4 Unknown
2 Toothpaste 2 Unknown
3 Shampoo 3 Unknown

The table has a clustered index on order_id column. The status column needs to be updated one row at a time. This may be required if the status is received as an output of a stored procedure.

Pick up a control variable and set it to minimum possible value. In this case a variable to hold order id, set to zero. Set rowcount to 1. This is the real trick. It causes only one row to be returned for a select query.

declare @prev_order_id int
select  @prev_order_id = 0
declare @nrows int
select @nrows = 1

while @nrows > 0
begin
    set rowcount 1

Now select data from table and perform operation using the data. Also select number of rows affected from the select operation. This must be stored in another variable because value of @@rowcount may subsequently change.

    select @product = product_name, @prev_order_id = order_id from ORDER where order_id > @prev_order_id
    select @nrows = @@rowcount
    -- perform operation using @product

End the loop and set row count to zero. Setting to zero returns all the rows affected.

end

set rowcount 0

Complete listing of the code below.

declare @prev_order_id int
select  @prev_order_id = 0
declare @nrows int
select @nrows = 1

while @nrows > 0
begin
    set rowcount 1
    select @product = product_name, @prev_order_id = order_id from ORDER where order_id > @prev_order_id
    select @nrows = @@rowcount
    -- perform operation using @product
end

set rowcount 0

 

Photo credit: katerha / Foter.com / CC BY

Floating help box with jQuery UI tooltip

jQuery UI has a nice tooltip widget. This widget is used to display a theamable tooltip for elements on a web page. That is the primary and intended use of the tooltip widget. However this widget can be innovatively used to display a startup help box.

Startup helpbox is used to display a quick instruction to the user. It is intended to get user started with the website or application. This help box disappears as soon as user starts using the application by typing, clicking etc. Lets consider a simple application that gets a string from user and shows length of the string. A startup help box could say, “Enter text below to see the number of characters entered.” The help box would disappear as soon as user starts typing in the text box. There would be a help button that when hovered would show the help box again. Lets see how to create such a help box.

To use jQuery UI tooltip, include jQuery UI library in your webpage. Normally, to create a tooltip for an element, title attribute is set on the this element and the tooltip widget is initialised on this element. The tooltip appears when mouse is hovered over this element.

However the startup help box would apppear as soon as the page loads whether or not the intended element is hovered or not. Also the tooltip would disappear when user starts to interact with the page. To achieve this, set the title attribute on a completely different element. For our string length web application, set the title attribute on the help button. Set the message that should be displayed in the help box as value of the title attribute. Also create a text input box where user would enter string.

<input id="stringInput"></input>
<span id="helpbutton" title="Enter text to find out number of characters entered">?</span>

Initialise the tooltip and set position on the intended element.  This would associate the tooltip with the help button but we want to see the tooltip next to the input text box. Hence set the position of the tooltip to the right hand side of the input text box. Unlike the simple tooltip, the help box should open on page load. This is achieved by calling open on the tooltip.

$("#helpbutton").tooltip({position: {of: "#stringInput", at: "right"}});
$("#helpbutton").tooltip("open");

The help box should close when user starts to type in the input box. Call close on the tooltip on keyup event of the input box. The help box is also closed when user clicks on the input box.

$("#stringInput").keyup(function()
{
    $("#helpbutton").tooltip("close");
});
$("#stringInput").click(function()
{
    $("#helpbutton").tooltip("close");
});

That is it! A simple trick to place a tooltip on a different element and opening and closing the tooltip on page load and click event creates a nice startup help box.

 

Photo credit: Mykl Roventine / Foter.com / CC BY-NC-SA