<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.7//EN" "https://dtd.nlm.nih.gov/ncbi/pubmed/in/PubMed.dtd">
<ArticleSet>
<Article>
<Journal>
				<PublisherName>University of Kashan</PublisherName>
				<JournalTitle>Mathematics Interdisciplinary Research</JournalTitle>
				<Issn>2538-3639</Issn>
				<Volume>9</Volume>
				<Issue>3</Issue>
				<PubDate PubStatus="epublish">
					<Year>2024</Year>
					<Month>09</Month>
					<Day>01</Day>
				</PubDate>
			</Journal>
<ArticleTitle>Modified Step Size for Enhanced Stochastic Gradient Descent: Convergence and Experiments</ArticleTitle>
<VernacularTitle></VernacularTitle>
			<FirstPage>237</FirstPage>
			<LastPage>253</LastPage>
			<ELocationID EIdType="pii">114474</ELocationID>
			
<ELocationID EIdType="doi">10.22052/mir.2023.253279.1426</ELocationID>
			
			<Language>EN</Language>
<AuthorList>
<Author>
					<FirstName>Mahsa</FirstName>
					<LastName>Soheil Shamaee</LastName>
<Affiliation>Department of Computer Science, Faculty of Mathematical Science, University of Kashan, Kashan, Iran</Affiliation>

</Author>
<Author>
					<FirstName>Sajad</FirstName>
					<LastName>Fathi Hafshejani</LastName>
<Affiliation>Department of Applied Mathematics, Shiraz University of Technology, Shiraz, I. R. Iran</Affiliation>

</Author>
</AuthorList>
				<PublicationType>Journal Article</PublicationType>
			<History>
				<PubDate PubStatus="received">
					<Year>2023</Year>
					<Month>07</Month>
					<Day>18</Day>
				</PubDate>
			</History>
		<Abstract>This paper introduces a novel approach to enhance the performance of the stochastic gradient descent (SGD) algorithm by incorporating a modified decay step size based on $\frac{1}{\sqrt{t}}$. The proposed step size integrates a logarithmic term, which yields smaller step sizes in the final iterations. Our analysis establishes a convergence rate of $O(\frac{\ln T}{\sqrt{T}})$ for smooth non-convex functions without the Polyak-Łojasiewicz condition. To evaluate the effectiveness of our approach, we conducted numerical experiments on image classification tasks using the Fashion-MNIST and CIFAR10 datasets. The results show accuracy improvements of $0.5\%$ and $1.4\%$, respectively, compared to the traditional $\frac{1}{\sqrt{t}}$ step size. The source code can be found at https://github.com/Shamaeem/LNSQRTStepSize.</Abstract>
		<ObjectList>
			<Object Type="keyword">
			<Param Name="value">Stochastic gradient descent</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Decay step size</Param>
			</Object>
			<Object Type="keyword">
			<Param Name="value">Convergence rate</Param>
			</Object>
		</ObjectList>
<ArchiveCopySource DocType="pdf">https://mir.kashanu.ac.ir/article_114474_dde1b8da8db0067c294448415069d8b8.pdf</ArchiveCopySource>
</Article>
</ArticleSet>
