# Category Archives: Algorithms for searching and sorting in C ++

In this section we will highlight the sorting algorithms and search data in arrays C ++

# Sorting structures. STL Part 13

Shown in the previous section various sorting - Flexible and beautiful mechanism for all occasions. That's just in practice, sort sequence almost never have a pure. This is all from the field of educational and demonstration tasks. In practice, the more often the problem is sorted quite voluminous data structures (even bulky in size, and according to the number of its fields). And sort (or resorting) they will have on the values ​​for the different fields of these same structures. But here come to the aid STL algorithms, especially when used in conjunction with the functors.

Let's see the typical model problem, which we've seen before - a description of the training group or faculty:

The program requests the field number data, on which sorting will be carried out (actually - comparison). If this number is entered, as a positive number, then sort by that field goes in ascending order. If the number is entered with a minus sign - the sort order is reversed:

Note, that the sorting algorithm does not care that sort: if the numeric data, the largest, and if lowercase, in order leksograficheskom.

# Sorting. STL (part 12)

It is a special group of sorting algorithms make - in practice it is necessary to sort a variety of objects and on a variety of criteria. An analysis of sorting algorithms are well and thoroughly studied, More than any other section of Computational Mathematics. The main issue of any sorting algorithm is its computational complexity - the number of comparison-exchange operations, required for sorting a sequence of length N, so-called O( N ).

The unpleasant fact is that, that the average complexity of different algorithms for different (on most test sequences) and the maximum complexity (at worst for a method of test sequence). For several tens of sorting algorithms, offered in learning Programming, shown, that the vast majority of them (intuitive) It is the worst in terms of average and in terms of maximum complexity. As an example,, popular among students bubble sorting It is the worst of all generally known.

Effective (complexity) the entire set of methods are only a few methods of fast recursive sorting. They are then presented to the STL implementations of algorithms. Standards do not insist on a hard constraint on domestic mechanisms for their implementation, therefore it may be possible depending on the library.

Now, having some knowledge about algorithms, functor and the general state of affairs with sorting, we are ready to consider options for implementing all this in your code. To start a variety of ways we sort numeric sequence. An example is too big, but it is not only, and as, to illustrate the, for subsequent self-experimentation:

Before compiling the program, in MVS press Alt + F7 and enter command arguments ?30 +3:

Some awkwardness example due to the fact, that we are in the same code and combine all provided STL sorting algorithms, and a variety of test sorted sequence, for example:

• Several random (?) sequences of length 30, sortable using (3) functor comparison, and detailed (+) terminal input and output sequences:

• Inverted (reverse, decreasing) sequence, sortable (6) "Heap" using a binary tree (command arguments -30 +6 ):

• long (1000000) sequence, the same, the previous case, conditions of, but only with the conclusion of correctness of the diagnosis results (command arguments -1000000 6 ):

The STL offers 3 Group sorting algorithms:

• sort() – the fastest sorting on average ( O( N * log( N ) ), but which can "fall" in the worst case to O( N * N ) (and it is a very poor indicator);

• sort_heap() – sorting "on the heap", using a binary tree, whose complexity always no worse than O( N + N * log( N ) ) (it worse sort() average, but rather the worst case);

• stable_sort() – "Stable" merge sort, which means stability, that it maintains the relative order of equal elements after sorting. It is sometimes very important. The complexity of this algorithm is much inferior sort();

Use the algorithm, which is most consistent with the constraints of your problem.

# Algorithms. STL (part 10)

STL containers would be a beautiful fiction quite far from practical use (as it was in the first years of existence of STL), if not the following fact: because of the single overall nature of container basic algorithms, of interest in practice, It can be implemented in a generalized form, applicable to any types of containers. Algorithms - is the bulk and the most demanded part of the library. Provided so many algorithms, that for a detailed description of all of them and not enough bulk book. Below we completely conditionally We divide them into groups and call by name (and also not all), and only for certain use cases construct.

for_each() The most commonly used algorithm - is for_each(): implementation of actions for the groups of elements (may all) container. The following shows a few examples of the algorithm for_each() for an array and vector, likewise, this algorithm can be used with any container STL:

Note: strings 3 It shows the operation of the new (introduced C ++ 11 standard) design for( auto &x : …), which has a similar effect assignment and can be applied to arrays and containers (in-operator function output vector in the flow shown such an option). This construction, generally speaking, It is not part of a library or algorithms STL, but it has the same effect, that algorithm for_each(): apply consistently to all elements of the collection.

This example shows the basic organization of logic all algorithms: a specified range (not necessarily to the entire collection), limited iterator start and end (often indicates the first 2 options) applied alternately function, functor, Or predicate (function, returns a result logixchesky, allowing for the selection of any ground).

find() Next algorithm significance - it find(). How intuitive behalf of, a search for an element in the collection. Note, many containers have method find(), which is an object to be called as obj.find(…), while algorithm It will be called as find( obj:iteator, … ).

Properly, it is not one, this algorithm, but a whole group of their extensive, which can be combined on the basis of, what are they selected elements of a collection on some grounds, condition, predicate: find(), find_if(), find_if_not(), find_first_of(), find_end(), adjacent_find(). In the same group, with some stretch, can be attributed count(), count_if(), search(), binary_search(), min(), max(), minmax_element(), min_element(), max_element(), equal() and etc.

Another conventional group - algorithms, in some way "who shuffle" collection, permute elements in places, changing values: fill(), replace_copy(), reverse(), rotate(), rotate_copy(), shuffle(), random_shuffle(), transform(), replace(), replace_if() and etc.

More artist - this algorithm works with 2 collections, toDrawing and moving content (moreover, perhaps, between the different types of collections, for example, vector<> in set<>): copy(), copy_if(), move(), swap_ranges(), remove_copy(), remove_copy_if(), merge(), set_intersection(), set_difference() and etc.

And, finally, very special group of algorithms is associated with a variety of sortings elements within the collection: sort(), stable_sort(), is_sorted(), is_sorted_until() and etc. This interesting group of us aside, for separate detailed consideration.

With such an abundance of implemented algorithms, and the number of which increases with time library, and that, that most of them do not really nowhere described in the literature, the question naturally arises: how to understand all this diversity? These difficulties are removed so that:

• All STL objects (containers, algorithms) described in syntax patterns (template). Therefore, their descriptions necessarily You should be included in the compiled code as part of their header files (header files).

• Go to the default directory </usr/include/c++> and find the header files of the form files stl_algo * - There you will find all functions prototypes algorithms. Furthermore, there each prototype is preceded by a thorough comment, explaining the purpose of the algorithm, and explains the call options.

• Consider the sample code, using some BASICх STL algorithms - a lot of them in the network. By analogy elementary and simply reproduce the behavior all the other algorithms.

Note: This is due to the fact, that the library of template classes are defined in terms of template, about syntax errors are a compilation). verbose, for tens of strings of messages and b). awfully vague to find bugs. This is the flip side of such a powerful mechanism as the template, and this should be ready.

As I mentioned above, case studies will remove a lot of questions, and therefore proceed to the code … Now watch carefully for hands (STL is a set of algorithms, that every comment – beyond a reasonable presentation of the volume, but they all work like one another):

Here containers used for char (as compact, but the most unpleasant work), over which run a variety of algorithms for almost all the designated groups:

# Search substring in string

Now we will look at examples, search algorithm might look for a substring within a string. Examples are based on the standard library functions, it is in these functions and exhibit all the convenience of writing programs. But the classic parsing algorithm, based on loops and comparisons, It is also quite remarkable. Therefore, we will consider it in the same lesson.

The algorithm itself is very simple in principle,. There are two strings. For example "Hello world" and "the"

Work will be in two loops:

1. The first will be to carry out a passage across the string, and look for location first letter search string ( "the" ).
2. Second, since the position of the first letter found – check, which letters are after it, and how many of them match in a row.

We illustrate searching for a substring within a string:

The first two iterations of the loop compared the letters would not be the same (marked in red). The third iteration of the desired letter (the first character of the desired word) It coincides with the character in the string, where the search. In this match the work included the second loop. It is designed to count the number of characters after the first in the search string, which will coincide with the characters in the original string. If one of the symbols do not match – Loop exits. It makes no sense to chase the loop of wasted, after the first discrepancy, since it is already clear, that there is no title.

In the third iteration matched only the first character of the string, but the second is not the same. The first loop will have to continue. The fourth iteration gives the required results – match all the characters of the string with a portion of the source string. And since all the characters matched – substring found. Algorithm work can be completed.

We will see, It looks like a classic substring search code string in C ++:

Two loops performed each his task. One stomping on the string hoping to find “head” search word (first character). The second finds, whether there is found after “head” “body” sought. And checks, whether it does not lie “body” at the end of the string. That is. is not found if the word length is one more than the required length of the string, considering, Null terminator that falls into this unit ( '' ).

We see, what program found begin substring however, in the cells of the character array index 0 and 4. But why? After all, in a word Parapapa 3 such substring. The thing '' .

In general, the meaning of the algorithm ends here. No more complications in addition to zero at the end of the string no. but, should pay attention to the multiplicity of search. What, if we need to find the string in several positions? How many times the search term occurs in the string and in what places? It is designed to control and the third parameter – int nnumber of occurrences of a string. If you put the unit back – He finds the first match title. If a deuce, it will make the first loop to skip first found, and seek a second. If three – look for the third and so on. Each search term found, occurrences of this counter is decremented. This allows us to describe the search loop:

That is, to find the first, second, third, fourth match… As long as the function does not return -1, that indicate the absence of N-tion of the title in a string.

Now, for comparison, search for a substring in a string ++ hederom string.

It is all! Loop string C ++ is provided by find(), returns the number of the cell, which begins with the body of the search string in the source string. Result:

Function find() takes second parameter symbol number, from which to start search. That is. When you find the first occurrence of, its value is incremented and find() search continues with the next character after the head found. Result:

All this is in C ++, himself class string comfortable enough to work with strings is, as strings, rather than just an array of characters.

# Sorting inserts

Another algorithm, designed to organize arrays, algorithm is Sorting inserts (Insertion Sort). This algorithm (as other, viewed on our website) simple enough. It consists of two loops (one embedded in the other). The first loop produces a passage through the array, and second – the movement of the elements to. Let's just take a look, It might look like a sort code, and already below analyze, how it works.

Algorithm Sorting inserts It can be described by the following positions:

1. Remember in a temporary variable ( buff example) the value of the current array element;
2. While elements of the left of the stored value is greater than the stored – We move them in the right position. It turns out, the previous item will take the stored cell. And the, that is before the previous – moves in turn the place of the previous. And so the items will move one after the other.
3. The movement ends with elements, If the next element, want to move, It is meaningfully less, than that, remember that in a temporary variable at the beginning of the loop.
4. Loop takes the next element, and again shifts all, which are located in front of him and big on value.

We will show visually the movement of values ​​in an array of seven elements during operation sorting inserts:

On the first iteration in a variable buffer is written with an index value of the cell 1 and the loop will check this item. There we find the number 2. It is larger than the value, is recorded in the zero cell, so the movement is not. Further, the variable buffer is written the value of the index of the cell 2 again comparing the values ​​go left, etc.. Only on the fourth iteration of the outer loop will overwrite values. Three first will be swapped with the five, and then with the four.

In this way, during sorting inserts recorded in element buff “sifted” to the beginning of array. In cases, when the element with a value less than would be found buff or the beginning of the sequence is reached – sieving stops.

A good visual illustration of the algorithm Sorting inserts is on wikipedia:

Time spent on the operation of this algorithm is entirely dependent on the initial data: the number of elements in the array and the original ordering. It's clear, that the larger the array – the longer the time necessary for its processing. Also, more time is required to sort the array in which the value is absolutely not ordered.

Algorithm Sorting inserts good for small arrays (up to several tens of elements). More efficient work, if the data of the array were previously partially sorted. If the array will be added new data (new items), algorithm can sort them as they are added (Unlike bubble sorting and sorting option). The effectiveness of the algorithm will increase significantly, if you add in the code algorithm binary search.

We also offer watch a short vidourok on computer with the analysis algorithm Sorting inserts:

# Selection sorting c ++

Meaning the sorting of selection (Selection Sort) It is to find the minimum value in the array element, and move this value to the beginning of array. It is necessary to make a reservation, that can be called in this case “the beginning” array (found where the minimum value moves). “Begining” algorithm Selection sorting with each step loop is shifted towards the tail of the array. Therefore, in the first iteration of the loop, found the minimum value is swapped with the value zero in the cell array. In the second iteration, “begining” already will point to the next (first) cell, and so on.

In fact it turns out a simple swapping of cell values array. When such an exchange does not need the shift values (overwrite) all elements of the array, to assign a minimum value in the appropriate cell. That is, the algorithm Selection sorting It does not require additional memory. Overwrite values ​​occurs immediately after finding the minimum value in an array.

The program code is quite simple, and does not require any special descriptions:

Role “beginning” it plays a counter i the outer loop. At each step, the value of the element, which counts the number of the variable, considered to be minimal. nested loop spends pass through the tail of the array, calculating the number of cells of the array with the minimum value (string 18 – ternary statement). If, after a nested loop variable min not changed, means the whole array of tail, that is processed, no minimum value, and the element “beginning” it remains in place. Otherwise – value change places with those found.

Tail processed array with each passage loop decreases and when it reaches the end of the array, he (array) will have sorted. The algorithm Selection sorting stop.

This is a great short video on the computer with the analysis the sorting of selection (Selection Sort):

Also read our following lessons devoted to sorting algorithms: bubble sorting and shaker sorting array.

# Sieve of Eratosthenes C ++

Sieve of Eratosthenes – one of the oldest algorithms, allows you to find the number, termed “simple”. That is. number, which are evenly divisible by one and only the. For example the number of 2. What natural (as much as) numbers can be divided 2, so as not to get the rest of? Just on 2 and 1. Or the number of 7. Same. No residue is divided again only by itself and one.

Enough simple algorithm even before BC came up with a cunning uncle Eratosthenes Kirensk. Greek nationality. Mathematics, astronomer, geographer. Sieve of Eratosthenes guys quite popular search algorithm.

What a special description of this algorithm is not really required. It involves two loops pass on a set of numbers. The first determines the next prime number, second inserted it strikes out all the complex numbers (according to a formula) standing after this simple. It is necessary to make a reservation, that the second loop once all complex numbers are not strikes. He strikes out the next number after a simple, that from this simple are at a certain distance. The distance is calculated by the formula: The current element in the square + value of this item. For example, if the number of 5 simple, then the next after him, which is equal to strike 5*5 = 10, later 5*5+5 = 15, then 5*5+5+5 = 20… and so on. Striking out in such a way that the number of multiples found simple. Finding a simple start with a number 2. Accordingly crossed out 2*2, 2*2+2, 2*2+2+2…

A good illustration is on the Wikipedia website:

Take the first prime number 2 (blue circle) and cross out all the numbers, which are multiples of two (blue crosses). Then take a prime number 3 (green circle) and cross out all that he is a multiple (green crosses) etc.

After, as the number removed from the list is easy to determine the next prime number – It will be the first not crossed out after the current.

Because the code itself is not very big and, I will not break it into blocks, there is nothing to it, and to specify. It looks for example so:

This code works as follows:: Requesting an array of numbers. I specifically for the convenience of one added to the array “subsidiary” in end member, The program begins to work with an array from scratch, as it is accepted to count in C ++ arrays, and a unit. For clarity, naturally. After the request is filled with an array of numbers in order. You could of course do not use an array of numbers, but simply an array of boolean (true, false), and primes considered item numbers, which will contain after sieving true, but I decided, that it is not clearly.

After, the numbers are placed in an array (1,2,3,4,5,6,7,8,9,10…n) you can start their screening. While the initial phase of all the numbers in a sieve are considered simple program. The first iteration of the loop takes first (or rather the second account) number – 2. From this number, the second loop resets the array elements, which fall under the filter condition * Number of Number of Number +. Thus the number of multiple screened after two deuces. Next, the first loop continues to run, checking the condition if(a[i]) , i.e.. whether there is a cell number, different from zero (because we zero out the complex number). Once a number is found, again have him work the second loop, zeroing after it found multiple numbers, standing after him.

And so the number of loops is screened until they reach a specified contact sifting border – to a number of simple catch.

Where are the prime numbers? huh… For example, in cryptography. Thus, in some random number generators. And of course in the universities :) In fact sieve teachers also do not shun and regularly expertly pampered student jobs “Finding a prime number”. Here with the help of such a sieve, this problem is completely solved.

# interpolates search

Guess, what the most favorite pastime of many teachers? prompt: Invent in their classrooms scary academic buzzwords, to the students hair stood on end from his sensei megaescarpment. One of these insidious little words, Throws young people in shock – interpolation. I propose to examine this marvel of mathematical thought from the simplest angle and using a minimum of code, so as not to soap eyes tons operators, entangling themselves in theory.

So, to start – what is this word is “interpolation”? interpolation – this is (if you speak the language of the student) define search area, similarity by calculating distances between a desired value and the whole area. Note the similarity of triangles in geometry? They, having the same angles meaningfully, but the proportions are different. There are practically the same principle. Calculate the length of the search area, and the length from the beginning of the area to a certain number (to say the central element in the array). The calculation of these things held as a numbered element, and with their meanings, after which the resulting length of the region multiplied by the length between the values, and the result of adding to the value of the start of the region gives the desired.

It is difficult to understand at once in words, so I try to show picture:

The formula is quite simple – calculated the length between the numbers of the first element and the desired (give the exact). The same length is considered to be between the first and last numbers. The lengths of each other are divided, just getting similarity calculation. The same happens with the values ​​of the elements – just calculated values ​​of the distance between the boundary of the array. Especially highlighted in color concepts and the associated part of the formula.

The resulting length of the array numbers is multiplied by the values ​​of the length (bordering) elements and added value in the first cell of the array.

It turns out: 1 + (20-1)/(100-1) * (200-5) = 38 kopecks.

The result is the most sought. That is. for such values ​​as in the picture in the cell №20 will stand 38. That's the whole point of interpolation – compilation of similarity between the item numbers and the values ​​between the elements.

In C ++, it looks like this:

Thus the loop itself just calculates the area using the formula array, which may be required using this same principle of interpolation in C ++, picking similarity so to speak. If the calculated is not equal to the desired, then you need to move the boundary, where the calculation of.

If calculated over – shift the right boundary of the search area, if less – the left. So cutting (how in binary search) piece by piece array gradually achieved the desired cell array, well, or the boundaries of the search area narrowed to those values, within which have nothing to look for, when the distance between the borders of equal 1 (i.e.. between the point A and have more elements to calculate) decision says, the value found in the array.

As you can see the loop itself is not so too difficult and terrible. All salt is hidden in his forum. But other – just checking: found or to seek further. Here's a she… interpolation in C ++

Interpolation search is a bit like binary. Only it is not the search region is divided into two equal parts. Instead, a new search is performed on the area of ​​assessment of the distance between the current value of the element, and the search key. The speed of the interpolation search exceeds the rate of binary. Another weighty contrast interpolating the binary search – the ability to search the text information, and not only the numerical values.

# Binary (binary) search for c++

We have already dealt with the algorithm linear search. In the same article mentioned, that is not the only algorithm, which makes it possible to find the target value of the array. Existother search algorithms. Binary (binary) search is more efficient (checked asymptotic analysis of algorithms) the decision in the case, if the array is sorted in advance.

Supposing, the array of 12 elements sorted ascending:

The user sets the desired value (search key). Let's 4. On the first iteration array It is divided into two parts (We are looking for the middle element – midd): (0 + 11) / 2 = 5 (0.5 discarded). At first, checks the value of the middle element of the array. If it coincides with a key – the algorithm terminates and the program displays the message, that value is found. In our case, the key does not match the value of the middle item.

If the key is less than the average member, the algorithm will not be searched in one half of the array, which contains a value greater than the key (i.e.. from the middle to the end of the array element). The right edge of the search moves (midd – 1). Then again on the array division 2.

The key again is not equal to the middle element. It's more him. Now the left border of the search moves (midd + 1).

The third iteration of the middle element – a cell index 3: (3 + 4) / 2 = 3. It is the key. The algorithm terminates.

Example:

Result:

The binary search algorithm is effective - his assessment of the complexity of O(log2(n)), while an ordinary incremental search O(n). It means, that for example, for an array of 1024 elements of linear search in the worst case (when the desired item is not in the array) process all 1024 elements, but a binary search is enough to handle the log2(1024) = 10 elements. This result is achieved due to the fact, that after the first step, the search area is narrowed to loop 512 elements, After the second – to 256 etc.

The disadvantages of such an algorithm is the requirement of ordering data, as well as the ability to access any data element for the continued (independent of the number of data) time. Thus, the algorithm can not work on any disordered arrays and data structures, built on the basis of linked lists.

# Linear search in C++

100% programmers, during training, sooner or later face the need to check for certain values in an array. There are several well-known search algorithms in programming languages. Now we look at the simplest of them (but not the most effective) – linear search or sequential search. Due to the fact, that search is conducted through the full brute force elements in the array and compare its value with the specified key, the algorithm is quite low speed.

To tell there is nothing special – It is better to show a linear search to work. In the example below to declare an array 50 elements and fill it using random number generator rand(). Prompt the user to enter the desired value with the keyboard and implement check for this value in our array. If a value is found in any element of the array – We display the index of the element. This is a classic example . It is hard and come up with something better to demonstrate the linear search in C ++.

Function performs a linear search is defined in strings 62-70. She returns to the program -1 in that case, if the value, is looking for a user, will not be found in the array. If the value is found – function returns the array index, wherein the value is stored.

Run:

In the absence of values ​​in an array:

After looking at the first picture, you will immediately notice, that in a cell with index 6 the desired value is found and the program shuts down, although the cells 23 and 33 it array are the same values. If you're okay with that, the index of the first element and is the result of the work program. Otherwise, the program should be finished, to find and record (for example, in a single array) all indices of cells, store the desired number (key).

Typically, a linear search is used to search in a small single array, that is not sorted. In other cases,, better and more efficient first sort the array and to use other search algorithms. For example binary (binary) Search or other.

To support our site – click on the piggy bank and choose any convenient way.