Open Access Short communication

Prediction Consistency of Lasso Regression Does Not Need Normal Errors

Kateřina Hlaváčková-Schindler

Journal of Advances in Mathematics and Computer Science, Page 1-7
DOI: 10.9734/BJMCS/2016/29533

In 2014, Sourav Chatterjee proved prediction consistency of any estimator using ordinary least squares (OLS) together with a Lasso penalty, under the conditions that the observations are upper bounded and the errors are normal, independent of the observations, with zero mean and finite variance. Reviewing his elegant proof, we conclude that the prediction consistency of OLS with Lasso can be proven under fewer assumptions, i.e., without assuming normality of the errors, knowing only that they have zero mean and finite variance. We give an upper bound on the convergence rate of the OLS-Lasso estimator for such errors. This upper bound is non-asymptotic and depends on both the number of regressors and the size of the data set. Knowing the number of regressors in a regression problem, one can thus estimate how large a data set is needed to bring the prediction error below a given value, and, in contrast to the cited work, without solving the parameter estimation problem of fitting the errors to a normal distribution. This result may encourage practitioners to use OLS-Lasso as a convergent algorithm for prediction with non-normal errors satisfying these milder conditions.
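The claim can be checked numerically. The sketch below is not the paper's proof or code: it fits Lasso by plain coordinate descent on bounded regressors with deliberately non-normal (uniform, zero-mean, finite-variance) errors, using an illustrative penalty schedule alpha ~ sqrt(log p / n), and observes the in-sample prediction error shrink as the data set grows. All constants and the data-generating process are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, alpha, sweeps=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + alpha*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(sweeps):
        for j in range(p):
            # partial residual with coordinate j removed
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, alpha) / col_sq[j]
    return b

def prediction_error(n, p=20, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, p))      # bounded observations
    beta = np.zeros(p)
    beta[:3] = [1.0, -0.5, 0.25]                 # sparse true coefficients
    eps = rng.uniform(-0.5, 0.5, size=n)         # zero mean, finite variance, NOT normal
    y = X @ beta + eps
    alpha = 0.5 * np.sqrt(np.log(p) / n)         # illustrative penalty shrinking with n
    b = lasso_cd(X, y, alpha)
    return float(np.mean((X @ (b - beta)) ** 2))

mse_small = prediction_error(100)
mse_large = prediction_error(5000)
```

With more data the penalty shrinks and the prediction error falls toward zero, with no normality anywhere in the error model.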

Open Access Original Research Article

Development of an Automated Descriptive Text-based Scoring System

K. M. Adesiji, O. C. Agbonifo, A. T. Adesuyi, O. Olabode

Journal of Advances in Mathematics and Computer Science, Page 1-14
DOI: 10.9734/BJMCS/2016/27558

Computers and electronic technology today offer a very large number of ways to enrich educational assessment, both in the classroom and in large-scale testing situations. Presently, in large-scale testing situations, scores are awarded manually. However, manual scoring is prone to inconsistency owing to emotional and cognitive human attributes, which can dampen students' morale. Thus, a computer-based text scoring system is proposed in order to alleviate the limitations of manual scoring in large-scale testing. In this work, an automated descriptive text-based scoring system (ADTSS) is developed for the science and technology domain. The ADTSS architecture consists of three modules: the domain knowledge, text reviewer, and scoring engine modules. The domain knowledge module contains a set of keywords relating to the terms, words, and sentences that describe the topic in question. The text reviewer appraises students' responses, trims and formats them, and maps each student's identity to the corresponding expected-response identity in the knowledge base. The scoring engine is divided into two components, viz. the marker class and the marks obtainable. The mark obtainable by a student is based on a multivariate Bernoulli model. The proposed ADTSS was evaluated using the responses of 50 students in a software engineering examination at the Federal University of Technology Akure (FUTA). The results show 73.7% accuracy of the proposed system using a mean divergence metric. They indicate that the proposed system can be used for text-based scoring, because the comparative analysis between the proposed and the manual scoring shows little divergence, and the problem of examiner bias is removed.
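The multivariate Bernoulli model scores a response by whether each domain keyword is present or absent. The minimal sketch below is an assumption about how such a scorer could look, not the paper's implementation; the keyword list and presence probabilities are hypothetical.

```python
import math

def bernoulli_score(response, keyword_probs):
    """Log-likelihood of a response under a multivariate Bernoulli
    keyword model: each keyword is either present (1) or absent (0)."""
    words = set(response.lower().split())
    score = 0.0
    for kw, p in keyword_probs.items():
        x = 1 if kw in words else 0
        score += x * math.log(p) + (1 - x) * math.log(1 - p)
    return score

# hypothetical domain-knowledge keywords with presence probabilities
keywords = {"requirement": 0.9, "design": 0.8, "testing": 0.7, "maintenance": 0.6}
good = bernoulli_score("software design starts with requirement analysis and testing", keywords)
poor = bernoulli_score("i do not remember", keywords)
```

A response covering more of the expected keywords receives a higher log-likelihood, which can then be mapped onto the marks obtainable.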

Open Access Original Research Article

Metrics of Mediation of Binary Location on Savings-Income Relationship for Middle Income Earners in Nigeria

H. Chike Nwankwo, A. Haruna Akibu, G. Bala George

Journal of Advances in Mathematics and Computer Science, Page 1-12
DOI: 10.9734/BJMCS/2016/27707

This study established the role of location (a binary variable) as a mediating variable in the relationship between income and savings of middle income earners in Etsako East and West Local Government Areas of Edo State, Nigeria. Due to the distinct variability of the population and the uncertainty of its size, stratified random sampling with equal allocation was employed as the sampling design. A total of 924 valid responses (462 each from rural and urban middle income dwellers in the two local government areas) were selected and used for the analyses. A linear regression model and a logistic regression model were used to estimate the effects that form the basis of the mediation. Because effects of different variables measured in different metrics are combined and compared, standardized coefficients and standard errors were used to test the significance of the mediated effect. Results show that location is a partial mediator in the relationship between income and savings, and the mediated effect is statistically significant. About 7.5% of the total effect is mediated by location; the size of the mediated effect is 0.13 (medium). We therefore recommend that these metrics for binary mediating variables in mediation analysis be extended to related cases where the mediating variable takes three possible ordinal outcomes. It is also recommended that this study be extended to a wider geographical region, such as the whole of Nigeria or its geo-political zones (South-South, South-West, South-East, and others).
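The reported quantities follow the standard product-of-coefficients logic: the mediated effect is the product of the income-to-location path (a) and the location-to-savings path (b), tested with a Sobel-type standard error. The sketch below is an illustration of that arithmetic, not the paper's fitted model; the coefficient values are hypothetical numbers chosen only so that the mediated effect (0.13) and proportion mediated (about 7.5%) match the magnitudes reported in the abstract, and the standard errors are invented.

```python
def mediation_summary(a, b, c_prime, sa, sb):
    """Product-of-coefficients mediation with a first-order Sobel SE.
    a: standardized effect of income on location (binary mediator),
    b: effect of location on savings controlling for income,
    c_prime: direct effect of income on savings."""
    mediated = a * b
    total = c_prime + mediated
    se = (b**2 * sa**2 + a**2 * sb**2) ** 0.5   # Sobel standard error
    z = mediated / se                            # compare with 1.96 at the 5% level
    return {"mediated": mediated, "proportion": mediated / total, "z": z}

# illustrative values, not the paper's estimates
res = mediation_summary(a=0.40, b=0.325, c_prime=1.60, sa=0.05, sb=0.06)
```

With these inputs the mediated effect is 0.13, about 7.5% of the total effect, and the z statistic exceeds 1.96, i.e. the mediated effect is statistically significant.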

Open Access Original Research Article

iCleaner: A Data Cleansing Tool for Outlier Detection in a Data Warehousing Environment

Kofi Sarpong Adu-Manu, John Kingsley Arthur, Joseph Kobina Panford, Joseph George Davis

Journal of Advances in Mathematics and Computer Science, Page 1-17
DOI: 10.9734/BJMCS/2016/28861

The implementation of Data Cleansing (DC) in Data Warehousing (DW) has become essential in recent years. Organizations around the world generate huge amounts of data from their day-to-day operations. These organizations will not survive if the data they generate remain dirty or erroneous. Errors or outliers that make data dirty include data entry errors, outdated data in the database, data migrated from old databases, and changes made at the source repository. Customers' changing needs to update their records (for example, attributes such as marital status, phone number, or address change with time) cause records to become obsolete and reduce the quality of the data. In order to maintain high quality data over time, the data require cleansing. The proposed Integrated Cleaning (iCleaner) tool is developed to facilitate the data cleaning process and to address the problems associated with duplicated records. In addition, the proposed tool is able to detect and update missing data by merging key columns within the records. The system is flexible to use and comes with a convenient, user-friendly interface designed for the data cleansing process. We provide efficient but simple algorithms designed to perform these functionalities and report the running time for the system performance.
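The core of such a cleansing step, grouping duplicate records by a key column and filling missing fields from the other copies, can be sketched as follows. This is a minimal illustration of the general technique, not iCleaner's actual algorithm; the record layout and key name are hypothetical.

```python
def merge_duplicates(records, key):
    """Group records by a key column and merge duplicates, filling
    missing (None) fields from other copies of the same record."""
    merged = {}
    for rec in records:
        k = rec[key]
        if k not in merged:
            merged[k] = dict(rec)
        else:
            for col, val in rec.items():
                if merged[k].get(col) is None and val is not None:
                    merged[k][col] = val   # fill the gap from the duplicate
    return list(merged.values())

# hypothetical customer table with one duplicated, partially missing record
customers = [
    {"id": 1, "name": "Ama", "phone": None},
    {"id": 1, "name": "Ama", "phone": "024-555"},   # duplicate carrying the phone number
    {"id": 2, "name": "Kofi", "phone": "020-111"},
]
clean = merge_duplicates(customers, "id")
```

After merging, the duplicate is collapsed into a single record whose previously missing phone field is now filled.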

Open Access Original Research Article

Stability Analysis of Deterministic Mathematical Model for Zika Virus

M. Khalid, Fareeha Sami Khan

Journal of Advances in Mathematics and Computer Science, Page 1-10
DOI: 10.9734/BJMCS/2016/29834

This research paper presents a stability analysis of the infectious states of the Zika virus across several population types from a mathematical perspective. The model focuses on viral activity at the disease-free equilibrium and the endemic equilibrium, and on the basic reproduction number R0 of the epidemic under the possibility of spread through human interaction. The constructed mathematical model is based on data sets obtained from three regions: Brazil, Cape Verde, and Colombia. The results obtained validate the given conditions.
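The role of R0 in such a stability analysis can be illustrated with a far simpler compartmental model. The sketch below is a basic SIR model integrated by Euler's method, a stand-in for the paper's multi-population Zika model, not the model itself; all parameter values are assumptions. When R0 = beta/gamma < 1 the infection dies out (the disease-free equilibrium is stable); when R0 > 1 an epidemic takes off.

```python
def simulate_sir(beta, gamma, days=400, dt=0.1):
    """Euler integration of a basic SIR model; R0 = beta / gamma.
    Returns R0 and the final recovered fraction (epidemic size)."""
    s, i, r = 0.999, 0.001, 0.0   # fractions of the population
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # S -> I transitions this step
        new_rec = gamma * i * dt      # I -> R transitions this step
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
    return beta / gamma, r

r0_low, final_low = simulate_sir(beta=0.1, gamma=0.2)    # R0 = 0.5: dies out
r0_high, final_high = simulate_sir(beta=0.4, gamma=0.2)  # R0 = 2.0: epidemic
```

The final epidemic size stays negligible below the R0 = 1 threshold and becomes a large fraction of the population above it, which is the qualitative behaviour a stability analysis of the two equilibria establishes.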

Open Access Original Research Article

A Comparative Study for Solving Nonlinear Fractional Heat-Like Equations via Elzaki Transform

Mohand M. Abdelrahim Mahgoub, Abdelilah K. Hassan Sedeeg

Journal of Advances in Mathematics and Computer Science, Page 1-12
DOI: 10.9734/BJMCS/2016/29922

In this paper, the Homotopy Perturbation Elzaki Transform Method (HPETM) and the Homotopy Decomposition Method (HDM) are used to solve nonlinear fractional heat-like equations. Both methods are very efficient techniques, quite capable in practice of solving different kinds of linear and nonlinear fractional differential equations. The results reveal that the HDM has an advantage over the HPETM: it solves nonlinear problems using only the inverse operator, which is basically the fractional integral. Additionally, there is no need to apply any other inverse transform to find the components of the series solutions, as in the case of the HPETM. As a consequence, the calculations involved in the HDM are very simple and easy to execute.
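For reference, fractional heat-like equations of this kind are commonly posed with the Caputo derivative, and the "inverse operator" invoked by the HDM is the Riemann-Liouville fractional integral; the standard definitions (a general reminder, not notation taken from the paper) are

```latex
% Riemann-Liouville fractional integral of order \alpha > 0
J^{\alpha} f(t) = \frac{1}{\Gamma(\alpha)} \int_{0}^{t} (t-\tau)^{\alpha-1} f(\tau)\, d\tau,

% Caputo fractional derivative, n-1 < \alpha \le n
{}^{C}D^{\alpha} f(t) = \frac{1}{\Gamma(n-\alpha)} \int_{0}^{t} (t-\tau)^{n-\alpha-1} f^{(n)}(\tau)\, d\tau,

% the inversion identity that lets the fractional integral recover the solution:
J^{\alpha}\left({}^{C}D^{\alpha} f\right)(t) = f(t) - \sum_{k=0}^{n-1} f^{(k)}(0)\, \frac{t^{k}}{k!}.
```

The last identity is what makes applying only the fractional integral sufficient to build the series solution components in the HDM, without any additional inverse transform.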